Legacy code - when to move on - legacy

My team and support a large number of legacy applications all of which are currently functional but problematic to support and maintain. They all depend on code that the compiler manufacture has officially no support for.
So the question is should we leave the code as is, and risk a new compiler breaking our code, or should we bite the bullet and update all the code?

The answer is totally dependant on the resources your employer (or yourself) can afford to make the refactoring (or even totally rewrite big parts).
So you should first estimate how much time/developers you can afford to refactoring the application, then see if you think it'll be enough.
If you can afford time and people, then do it, don't hesitate! You're investing in the future by reducing the time to debug the application so it will be helpful and less expensive once the refactoring is done.

It depends on the nature of the applications, just how big and important they are, as well as the programming culture at your workplace, and the resources available to you.
If the applications are valuable enough to you that they are worth the trouble, and you have the necessary resources, then do the update. Don't let the problem persist.
If they are not valuable enough to be worth a full-scale update effort, or appropriate resources are not at hand, perhaps work on updating one at a time if possible.
Just some suggestions, but again this greatly depends on you and your organization.

It sounds like you have a large technical debt. This debt is only going to increase unless you do something. Both things you mentioned are options, and risky, but long term it's a risk you need to take.
Using an updated compiler just means you need to update the code to work in the new compiler. Something is bound to break, but then refactor the parts that break. This allows you to migrate.
The other option is to update your entire code base. This takes time, during which you need to maintain 2 copies of the code, or freeze the old version. Freezing the old version is probably not an option.
I would recommend using an updated compiler and fixing what breaks. This allows you to add features, while refactoring and fixing the current codebase.

Rewriting the code can be an useful step for you company for many reasons:
you can use a new compiler and a more recent platform
you can refactor the code deleting its weaknesses
you can motivate your people because developing new code is better than correct bugs in an old one.
Why don't you start that activity with a small number of people, beginning from the most common parts of the code? You can group them into a dll and use it also for future projects.

Related

Disagreement on software time estimation

How do you deal with a client who has different time estimates for the software product than yours?
I am going to describe a scenario that is not mine, but that captures broadly the same problem. I am working as a subcontractor to a large company that has a programming department. The software project we are working on is in an area that the department believe they have a handle on, but because their expertise and mine are very different we tend to get different results.
Example: At the start of the project I suggested one way of development which they rubbished as being unrealistically difficult and suggested integrating a different framework (one they are familiar with) with the programming language we are using (Python) to get more or less the same result.
Their estimate for this integration: less than a week (they haven't done the integration before).
My estimate for the integration: above two weeks.
Using my suggested way to get the result needed (including using matplotlib among other libraries used elsewhere within the project): 45 minutes. This is not an estimate, the bit was actually finished in 45 minutes.
Example: for the software to be integrated with their internal system, they needed to provide a web service for me to use. They provided a broken one, though it does work with their internal tool (doesn't work with .Net or Java mainstream packages among other options). They maintain that it is my fault that the integration has taken longer than the time estimated.
The problem is not that they don't know, the problem is that they have enough knowledge about programming to be dangerous (in my opinion). Is there some guidelines for how to deal with this type of situation? A way for expectation management? Or may be I shouldn't get involved in such projects from the start and in this case what are the telltale signs?
If a client isn't happy with a time estimate, don't do the work. If they think they can do it better or faster, tell them to go ahead.
The one thing I never allow is for my estimates to be modified. That's something that caught me out early on in my career but we learn our lessons.
If clients were so good at doing the work, they wouldn't be hiring me. I'd simply point out that they hired me for my expertise so why are they disregarding that expertise. Of course, if they were to allow the scope of the project to change (i.e., less work), that would be another matter, and one up for discussion.
If you didn't lock in exactly what they were meant to provide as part of the deal, then it's a "he says, she says" situation and, unfortunately, the customer controls the purse strings. However, often, the greatest power you can have is the ability to just walk away.
No-one says you have to do the job.
Of course, all that advice above is worth every cent you paid for it :-)
I don't know your specific circumstances.
Or may be I shouldn't get involved in such projects from the start and in this case what are the telltale signs?
My answer for sure. If you can avoid those projects, do it.
Some signs : people thinking they know how to do things when you can guess they can't. The "oh no let's not use this perfectly suitable tool because I don't know it" is a major indicator that the person is technically challenged.
first of all, it is no fun to be in such an environment.
So, if you like to have fun at your job, and you do not need to take this job for extenuating financial reasons, then simply do not take the job that is not fun.
Since that is hardly realistic in many cases, you will end up with the job and need to manage the situation as best you can. One way is to make sure there is a paper trail documenting your objections and concerns with the plan. Try not to be overtly negative, but try to be constructive and present valid alternatives. Here you will need to feel out the political landscape, determine if the 'boss' will be appreciative or threatened by your commentary, and act accordingly.
Many times there are other issues that management is dealing with that you are not aware of. Be cautious of this fact, and maybe ask the management team if this is the case, again without being condescending or negative.
Finally, if you have alternatives that take less time than the meetings it would take to discuss them, just try it in a sandbox, and show it off. This would go a long way to 'proving' your points. Caution here is that you could be accused of not being a team player, or of wasting resources, or not following direction. Make sure this is mitigated by doing these types of things on your own time, or after careful consideration of how long you are spending on these things as well as how vested your boss seems to be on the alternatives.
hth
I ran into the same problem with integration. Example: for the
software to be integrated with their internal system, they needed to
provide a web service for me to use...They maintain that it is my
fault that the integration has taken longer than the time estimated.
Wow very similar to what I was experiencing with a client. The best thing I can suggest is to keep good documentation. In the end that is what saved me. When it came to finger pointing I had all of the emails and facts in order and was prepared to defend my self.
One thing I would suggest is to separate out a target/goal and an estimation. I would not change my estimate unless it involved actually removing features or something is revealed that would make it easier. Tell them you will try to hit the target in anyway you can and you care about the business goal. However, your estimate will not change. If its getting no where and they are just dense then smile and nod and take it if its the only gig around.
Was just writing about this in my blog
How to estimate the WRONG way

How do you refactor a large messy codebase?

I have a big mess of code. Admittedly, I wrote it myself - a year ago. It's not well commented but it's not very complicated either, so I can understand it -- just not well enough to know where to start as far as refactoring it.
I violated every rule that I have read about over the past year. There are classes with multiple responsibilities, there are indirect accesses (I forget the term - something like foo.bar.doSomething()), and like I said it is not well commented. On top of that, it's the beginnings of a game, so the graphics is coupled with the data, or the places where I tried to decouple graphics and data, I made the data public in order for the graphics to be able to access the data it needs...
It's a huge mess! Where do I start? How would you start on something like this?
My current approach is to take variables and switch them to private and then refactor the pieces that break, but that doesn't seem to be enough. Please suggest other strategies for wading through this mess and turning it into something clean so that I can continue where I left off!
Update two days later: I have been drawing out UML-like diagrams of my classes, and catching some of the "Low Hanging Fruit" along the way. I've even found some bits of code that were the beginnings of new features, but as I'm trying to slim everything down, I've been able to delete those bits and make the project feel cleaner. I'm probably going to refactor as much as possible before rigging my test cases (but only the things that are 100% certain not to impact the functionality, of course!), so that I won't have to refactor test cases as I change functionality. (do you think I'm doing it right or would it, in your opinion, be easier for me to suck it up and write the tests first?)
Please vote for the best answer so that I can mark it fairly! Feel free to add your own answer to the bunch as well, there's still room for you! I'll give it another day or so and then probably mark the highest-voted answer as accepted.
Thanks to everyone who has responded so far!
June 25, 2010: I discovered a blog post which directly answers this question from someone who seems to have a pretty good grasp of programming: (or maybe not, if you read his article :) )
To that end, I do four things when I
need to refactor code:
Determine what the purpose of the code was
Draw UML and action diagrams of the classes involved
Shop around for the right design patterns
Determine clearer names for the current classes and methods
Pick yourself up a copy of Martin Fowler's Refactoring. It has some good advice on ways to break down your refactoring problem. About 75% of the book is little cookbook-style refactoring steps you can do. It also advocates automated unit tests that you can run after each step to prove your code still works.
As for a place to start, I would sit down and draw out a high-level architecture of your program. You don't have to get fancy with detailed UML models, but some basic UML is not a bad idea. You need a big picture idea of how the major pieces fit together so you can visually see where your decoupling is going to happen. Just a page or two of some basic block diagrams will help with the overwhelming feeling you have right now.
Without some sort of high level spec or design, you just risk getting lost again and ending up with another unmaintainable mess.
If you need to start from scratch, remember that you never truly start from scratch. You have some code and the knowledge you gained from your first time. But sometimes it does help to start with a blank project and pull things in as you go, rather than put out fires in a messy code base. Just remember not to completely throw out the old, use it for its good parts and pull them in as you go.
What was most important for me on different occasions were unit tests: I took a few days to write tests for the old code and then I was free to refactor with confidence. How exactly is a different question, but having the tests made it possible for me to make real, substantial changes to the code.
I'll second everyone's recommendations for Fowler's Refactoring, but in your specific case you may want to look at Michael Feathers' Working Effectively with Legacy Code, which is really perfect for your situation.
Feathers talks about Characterization Tests, which are unit tests not to assert known behaviour of the system but to explore and define the existing (unclear) behaviour -- in the case where you've written your own legacy code, and fixing it yourself, this may not be so important, but if your design is sloppy then it's quite possible there are parts of the code that work by 'magic' and their behaviour isn't clear, even to you -- in that case, characterization tests will help.
One great part of the book is the discussion about finding (or creating) seams in your codebase -- seams are natural 'fault lines', if you like, where you can break into the existing system to start testing it, and pulling it towards a better design. Hard to explain but well worth a read.
There's a brief paper where Feathers fleshes out some of the concepts from the book, but it really is well worth hunting down the whole thing. It's one of my favourites.
Just an additional refactoring that is more important than you think: Name things correctly!
This goes for any variable name and method name. If the name does not accurately reflect what the thing is used for, then rename it to something more accurate. This might require several iterations. If you cannot find a short, and entirely accurate name, then that item does too much and you have an excellent candidate for a code snippet that needs to be split. The names also clearly indicate where the cuts are to be made.
Also, document your stuff. Whenever the answer to WHY? is not clearly conveyed by the answer to HOW? (being the code) you will need to add some documentation. Capturing design decisions is probably the most important task as it is very hard to do in code.
You could always start from "scratch". That doesn't mean scrap it and start from nothing, but try to rethink high-level things from the beginning, since you seem to have learned a lot since the last time you worked on it.
Start from a higher level, and as you build the scaffolding of your new and improved structure, take all the code you can reuse, which will probably be more than you think if you're willing to read through it and make some small changes.
When you're making the changes, be sure to be strict with yourself about following all the good practices you now know, because you will really thank yourself later.
It can be surprisingly refreshing to properly re-make program to do exactly what it did before, only more "cleanly". ;)
As others have mentioned as well, unit-tests are your best friend! They help you ensure that your refactoring works, and if you're starting from "scratch", it's the perfect time to write them.
You're in a much better position than many people facing this problem in that you understand what the code is supposed to do.
Taking variables out of a shared scope, as you're doing, is a great start, in that you're partitioning responsibilities. Ultimately you want each class to express a single responsibility. A few other things you might look at:
Easy targets for refactoring are code that's duplicated in lots of places and long methods.
If you're managing application state through statically initialized singletons or worse, a global state that everything is talking to, consider moving it to a managed initialization system (i.e. a dependency injection framework like spring or guice) or at least make sure that the initialization isn't entangled with the rest of the code.
Centralize and standardize how you're accessing outside resources, especially if you've got things like file locations or urls hardcoded.
Buy an IDE that has good refactoring support. I think IntelliJ is the best, but Eclipse has it now, too.
The unit test idea is key as well. You will want to have a suite of large, overall transactions that will give you the overall behavior of the code.
Once you have those, start creating unit tests for classes and smaller packages. Write the tests to demonstrate proper behavior, make your changes, and re-run the tests to demonstrate that you haven't broken everything.
Track code coverage as you go. You'll want to work it up to 70% or better. For the classes you change, you'll want those to be 70% or better before you make your changes.
Build up that safety net over time and you'll be able to refactor with some confidence.
very slowly :D
No seriously... take it one step at a time. For instance, refactor something only if it affects or helps you write the current bug/feature that you are working on right now and do no more than that. And before you refactor make darn sure that you have some kind of automated test in place that gets run on each build that will actually test what you are writing/refactoring. Even if you don't have unit tests, it is never too late to start adding them for all new and modified code that is being written. Over time, your code base will get better in small increments daily or weekly instead of worse - all without you making monumental heaps of changes.
In my personal opinion and experience, it's not worth it to just refactor a (legacy) codebase en masse for the sake of refactoring. In those cases, it's best to just start from scratch and do it right all over again (and very rarely are you afforded the opportunity to do such a thing). Hence, just refactoring incremental is the way to go.
For Java code, my favorite first step is to run Findbugs and then remove all the dead stores, un-used fields, unreachable catch blocks, unused private methods and likely bugs.
Next I run CPD to look for evidence of cut-copy-paste code.
It isn't unusual to be able to reduce the code base by 5% by doing this. It also saves you from refactoring code that is never used.
I think you should use Eclipse as a IDE because it is having many plugins and free of cost.You should now follow the MVC pattern and yes must write test cases using JUnit.Eclipse also have plugin for JUnit and it is providing code refactoring facility too so that will reduce your some work.And always remember that writing a code is not important the main thing is to write clean code.So now give comments everywhere so that not only you but any other person read the code then while reading the code he must feel that he is reading an essay.
Refactor the low-hanging fruit. Nibble away at the easy bits, and as you do that, the harder bits will begin to be a little easier. When there aren't any bits left to refactor, you're done.
The refactorings you'll probably find most useful are Rename Method (and even more trivial Renamings like Field, Variable, and Parameter), Extract Method, and Extract Class. For each refactoring you perform, write the necessary unit tests to make the refactoring safe, and run the entire suite of unit tests after each refactoring. It's tempting - and, let's be honest, pretty safe - to rely on the automated refactorings of your IDE, without the tests - but it's good practice and will be good to have the tests into the future as you add functionality to your project.
You might want to look at Martin Fowler's book Refactoring. This is the book that popularized the term and technique (my thought when taking his course: "I've been doing a lot of this all along, I didn't know it had a name"). A quote from the link:
Refactoring is a controlled technique
for improving the design of an
existing code base. Its essence is
applying a series of small
behavior-preserving transformations,
each of which "too small to be worth
doing". However the cumulative effect
of each of these transformations is
quite significant. By doing them in
small steps you reduce the risk of
introducing errors. You also avoid
having the system broken while you are
carrying out the restructuring - which
allows you to gradually refactor a
system over an extended period of
time.
As others have pointed out, unit tests will allow you to refactor with confidence. And start by reducing code duplication. The book will give you lots of other insights.
Here is a catalog of refactorings.
The correct definition of messy code, is code that hard to maintain and change.
To use more mathematical definition, you can check your code by code metrics tools.
This way, you will keep the code that already good enough, and find very fast, the wrong code.
My experience say, that is very powerful way to improve the quality of your code. (if your tool can show you the result on each build or on realtime)
Throw it away, build it new.

What are the disadvantages code reuse?

A few years ago, we needed a C++ IPC library for making function calls over TCP. We chose one and used it in our application. After a while, it became clear it didn't provide all functionality we needed. In the next version of our software, we threw the third party IPC library out and replaced it by one we wrote ourselves. From then on, I sometimes doubt whether this was a good decision, because it has proven to be quite a lot of work and it obviously felt like reinventing the wheel. So my question is: are there disadvantages to code reuse that justify this reinvention?
I can suggest a few
The bugs get replicated - If you reuse a buggy code :)
Sometimes it may add an additional overhead. As an example if you just need to do a simple thing it is not advisable to use a complex BIG library that implements the required feature.
You might face with some licensing concerns.
You may need to spend some time to learn\configure the external library. This may not be effective if the re-development takes a much lower time.
Reusing a poorly documented library may get more time than expected/estimated
P.S. The reasons for writing our own library were:
Evaluating external libraries is often very difficult and it takes a lot of time. Also, some problems only become visible after a thorough evaluation.
It made it possible to introduce some features that are specific for our project.
It is easier to do maintenance and to write extensions, as you know the library through and through.
It's pretty much always case by case. You have to look at the suitability and quality of what you're trying to reuse.
The number one issue is: you can only successfully reuse code if that code is GOOD code. If it was designed poorly, has bugs, or is very fragile then you'll run into the same issues you already did run into -- you have to go do it yourself anyway because it's so hard to modify the existing code.
However, if it's a third-party library that you are considering using that you don't have the source code for, it's a little different. You can try and get the source if it's that kind of library. Some commercial library vendors are open to suggestions and feature requests.
The Golden Wisdom :: It Has To Be Usable Before It Can Be Reusable.
The biggest disadvantage (you mention it yourself) by reusing third party libraries, is that you are strongly coupled and dependent to how that library works and how it's supposed to be used, unless you manage to create a middle interface layer that can take care of it.
But it's hard to create a generic interface, since replacing an existing library with another one, more or less requires that the new functionality works in similar ways. However, you can always rewrite the code using it, but that might be very hard and take a long time.
Another aspect is that if you reinvent the wheel, you have complete control over what's happening and you can do modifications as you see fit. This can be completely impossible if you are depending on a third part library being alive and constantly providing you with updates and bug fixes. On the other hand, reusing code this way enables you to focus on other things in your software, which sometimes might be the thing to do.
There's always a trade off.
If your code relies on external resources and those go away, you may be crippling portions of many applications.
Since most reused code comes from the internet, you run into all the issues with the Bathroom Wall of Code Atwood talks about. You can run into issues with insecure or unreliable borrowed code, and the more black boxed it is, the worse.
Disadvantages of code reuse:
Debugging takes a whole lot longer since it's not your code and it's likely that it's somewhat bloated code.
Any specific requirements will also take more work since you are constrained by the code you're re-using and have to work around it's limitations.
Constant code reuse will result in the long run in a bloated and disorganized applications with hard to chase bugs - programming hell.
Re-using code can (dependently on the case) reduce the challenge and satisfaction factor for the programmer, and also waste an opportunity to develop new skills.
It depends on the case, the language and the code you want to re-use or re-write. In general I believe that the higher-level the language is, the more I tend towards code reuse. Bugs in higher-level language can have a bigger impact, and they're easier to rewrite. High level code must stay readable, neat and flexible. Of course that could be said of all code, but, somehow, rewriting a C library sounds less of a good idea than rewriting (or rather re-factoring) PHP model code.
So anyway, these are some of the arguments I'd use to promote "reinventing the wheel".
Sometimes it's just faster, more fun, and better in the long run to rewrite from scratch than having to work around bugs and limitation of a current codebase.
Wondering what you are using to keep this library you reinvented?
Initial time for create a reusable code is more expensive and time cost
When master branch has an update you need to sync it and deploy again
The bugs get replicated - If you reuse a buggy code
Reusing a poorly documented code may get more time than expected/estimated

Maintaining code that is close to software rot

Let's say you're the lucky programmer who inherited a code that is close to software rot. Software rot as defined in Pragmatic Programmer is code that is too ugly (in this case, unrefactored code) it is compared to a broken window that no one wants to fix it and in turn can damage a house and can cause criminals to thrive in that city.
But it is the same code that Joel Spolsky in JoelOnSoftware values in such a way that it contains valuable patches which have been debugged throughout its lifetime (which can look unstructured and ugly).
How would you maintain this?
Have a look at Working Effectively with Legacy Code by Michael Feathers. Lots of good advice there.
Welc is a great book. You should certainly check it out.
If you don't want to wait for the book arrive, I can summarise the bits I think are important
You need to understand your system. Do some throwaway coding to understand the part you need to work on. E.g. be prepared to try and do some work to get the system under test based upon the knowledge that you will probably break it. (understand what went wrong)
Look for areas where you can break dependencies. Michael Feathers calls these seams. They are points where you can take abit of the legacy system and refactor it so it will be testable.
As you work on the system add tests as you go.
You can do a few things:
Refactor the code to make it more maintainable. If the code is being used for feature development as well then refactoring will make sense.
If the code is legacy code and is touched only for bug fixes then I would suggest you only fix as much as required and when required.
Often, the first impression people get from such legacy acquired code is that its messy. Give it some time and get comfortable with it. You may see some valid reasons as to why the code looks this way with time to come...
First, make sure that you have a robust test procedure for it, and that it will actually be tested again in depth, by several people (you, QA, ...).
Then, take some time, day after day, to improve the small parts you have to modify. The key is to have a management that understands "why it takes longer as expected". Explain that you have to do refactoring and that it is important for both short and long term, ask other developers to review the existing code and confirm your arguments.

What makes code legacy?

I have heard many developers refer to code as "legacy". Most of the time it is code that has been written by someone who no longer works on the project. What is it that makes code, legacy code?
Update in response to:
"Something handed down from an ancestor or a predecessor or from the past" http://www.thefreedictionary.com/legacy. Clearly you wanted to know something else. Could you clarify or expand your question? S.Lott
I am looking for the symptoms of legacy code that make it unusable or a nightmare to work with. When is it better to throw it away? It is my opinion that code should be thrown away more often and that reinventing the wheel is valuable part of development. The academic ideal of not reinventing the wheel is a nice one but it is not very practical.
On the other hand there is obviously legacy code worth keeping.
By using hardware, software, APIs, languages, technologies or features that are either no longer supported or have been superceded, typically combined with little to no possibility of ever replacing that code, instead using it til it or the system dies.
What is it that makes code, legacy code?
As with plain legacy, when the author is dead or missing, you as a heir get all or some of his code.
You shed some tears and try to figure out what to do with all this rubbish.
Michael Feathers has an interesting definition in his book Working Effectively with Legacy Code. According to him legacy code is code without automated tests.
It is a very general (and oft abused term) but any of the following would be legitimate reasons to call an app legacy:
The code base is based on a language/platform which is entirely unsupported by the manufacturer of the original product (often said manufacturer has gone out of business).
(really 1a) The code base or platform on which it is built is so old that getting qualified or experienced developers for the system is both hard and expensive.
The application supports some aspect of the business which is no longer actively grown and for which alterations are extremely rare, normally to fix it if something entirely unexpected changes around it (the canonical example being the Y2K issue) or if some regulation/external pressure forces it. Since both reasons are pressing and normally unavoidable but no significant development has occurred on the project it is likely that those people assigned to deal with this will be unfamiliar with the system (and it's accumulated behaviours and intricacies). In these cases this would often be reason to increase the perceived and planned for risk associated with the project.
The system has/or is being replaced with another. As such the system may be used for much less than originally intended, or perhaps only as a means of viewing historical data.
Legacy generally refers to code that is no longer being developed - meaning that if you use it, you have to use it on its original terms - you cannot just edit it to support the way the world looks today. For example, legacy code has to run on hardware that may not exist today - or is no longer supported.
According to Michael Feathers, the author of the excellent Working Effectively with Legacy Code, legacy code is a code which has no tests. When there is no way to know what breaks when this code changes.
The main thing that distinguishes
legacy code from non-legacy code is
tests, or rather a lack of tests. We
can get a sense of this with a little
thought experiment: how easy would it
be to modify your code base if it
could bite back, if it could tell you
when you made a mistake? It would be
pretty easy, wouldn't it? Most of the
fear involved in making changes to
large code bases is fear of
introducing subtle bugs; fear of
changing things inadvertently. With
tests, you can make things better with
impunity. To me, the difference is so
critical, it overwhelms any other
distinction. With tests, you can make
things better. Without them, you just
don’t know whether things are getting
better or worse.
Nobody is gonna read this, but I feel the other answers don't get it quite right:
It has value, if it wasn't useful it would've been thrown away long ago
Its hard to reason about because either of
Lack of documentation,
Original author cannot be found or forgot (yes 2 months later your code can be legacy code too!!),
Lack of tests or typesystem
Doesn't follow modern practices (ie no context to hold on too)
There is a requirement to change or extend it.
If there isn't a requirement to change it, it isn't legacy code
since nobody cares about it. It does its thing and there is nobody
around to call it legacy code.
A colleague once told me that legacy code was any code that you hadn't written yourself.
Arguably, it's just a pejorative term for code that we don't like any more for whatever reason (typically because it's not cool or fashionable but it works).
The TDD brigade might suggest that any code without tests is legacy code.
Legacy code is source code that relates to a no-longer supported or manufactured operating system or other computer technology.
http://en.wikipedia.org/wiki/Legacy_code
"Legacy code is source code that relates to a no-longer supported or manufactured "
Any code with support (or documentation) missing. Be it:
inline comments
technical documentation
spoken documentation (the person who wrote it)
unit tests documenting the workings of the code
For me legacy code is code that was written prior to some paradigm shift.
It may still be very much in use but it is in the process of being refactored to bring it into line.
e.g. Old procedural code hanging around in an otherwise OO system.
Code (or anything else, really) becomes "legacy" when it has been replaced by something newer/better, and yet despite this it's still used and kept alive "in the wild".
Preserving legacy code is not so much an academic ideal as it is keeping code that works, no matter how poorly. In many conservative enterprise situations, that would be considered more practical than throwing it away and starting again from scratch. Better the devil you know...
Legacy code is code that is painful/expensive to keep current with changing requirements.
There are two ways that this can happen:
The code is unsuitable for change
The semantics of the code have been swapped out to silicon
1) is the easier of the two to recognize. It is software that has fundamental limits making it unable to keep up with the ecosystem around it. For example, a system built around O(n^2) algorithm won't scale beyond a certain point and must be re-written if requirements move in that direction. Another example is code using libraries that are not supported on the latest OS versions.
2) Is harder to recognize, but all code of this kind shares the characteristic that people are afraid to change it. This could be because it was badly written/documented to begin with, because it is untested, or because it is non-trivial and the original authors who understood it left the team.
The ASCII/Unicode chars that comprise living code have semantic meaning, the "why's", "what's" and to some degree the "how's", in the minds of people associated with it. Legacy code is either un-owned or the owners do not have meaning associated with large portions of it. Once this happens (and it could happen the next day with really poorly-written code), to change this code, someone must learn it and understand it. This process is a significant fraction of the time it takes to write it in the first place.
The day you're afraid to refactor your code is the day when your code has become legacy.
I consider code "legacy" if any or all of the following conditions apply:
It was written using a language or methodology that is a generation behind current standards
The code is a complete mess with no planning or design behind it
It is written in outdated languages and in an outdated, non object-oriented style
It is difficult to find developers who know the language because it is so old
Unlike some of the other opinions here, I've seen plenty of modern applications that work decently without unit tests. Unit testing still has not caught on with everyone. Perhaps ten years from now the next generation of programmers will look at our current applications and consider them "legacy" for not containing unit tests, just as I consider non object-oriented applications to be legacy.
If few changes need to be made to a legacy codebase, it's better to simply leave it as-is and go with the flow. If the application needs drastic functionality changes, a GUI overhaul, and/or you can't find anyone who knows the programming language, it's time to throw away and start over. A word of warning, however: rewriting from scratch can be very time-consuming, and it's difficult to know if you've replicated all functionality. You'll probably want to have test cases and unit tests written for the legacy application and the new application.
Quite honestly legacy code is any code, framework, api, of other software construct thta's not "cool" anymore. For example COBOL is unanimously regarded as legacy while APL is not. Now one can also make the case that COBOL is consideed legacy and APL not because it has about 1m times the install base as APL. However, if you say that you need to work on APL code the reply would not be "oh no, that legacy stuff" but rather "oh my god, guess you won't be doing anything for the next century" see the difference?
This is a general term thrown around quite often (and quite generically) in the software ecosystem.
Well, I like to think of legacy code as inherited code. This is simply code that was written in the past. In most cases, legacy code do not follow new/current practices and is often considered archaic.
Legacy code is anything written more than a month ago :-)
It's often any code that isn't written in the trendy scripting language du jour, and I'm only half joking.