(This is a CS-theory type of question; I hope that's acceptable.)
The "Lisp-1 vs Lisp-2" debate is about whether the namespace of functions should be distinct from the namespace of all other variables, and it's relevant in dynamically typed languages that allow the programmer to pass around functions as values. Lisp-1 languages (such as Scheme) have one namespace, so you can't have both a function named f and also an integer named f (one would shadow the other, just like two integers named f). Lisp-2 languages (such as Common Lisp) have two namespaces, so you can have both f variables, but you have to specify which one you mean with special syntax (#'f is the function and f is the integer).
It seems to me that the main technical problem, the need to disambiguate the function from the integer, is not an issue if the language is also statically typed (unlike most Lisps). For instance, if a sort function takes a list and a less-than function in its explicit signature,
def sort[X](list: List[X], lessThan: (X, X) => Boolean) // Scala syntax; had to pick something
then it wouldn't matter whether functions and everything else share a namespace or not. sort(mylist, myless) would only pass the type check if myless is a function; no special syntax is needed. Some people argue that one namespace is more aesthetically pleasing than two, but I'd like to focus on technical issues.
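To make that concrete, here is a rough sketch of the same idea in C++ (hypothetical names; any statically typed language would do). The parameter's declared type already says "this must be a function", so the call site needs no special marker:

    #include <functional>
    #include <list>
    #include <string>

    // A sort that takes its comparison function as an ordinary argument.
    template <typename X>
    void sortBy(std::list<X>& items, std::function<bool(const X&, const X&)> lessThan) {
        items.sort(lessThan);
    }

    void demo() {
        std::list<std::string> mylist{"b", "a"};
        auto myless = [](const std::string& a, const std::string& b) { return a < b; };
        int f = 42;               // an ordinary value sharing the one namespace
        (void)f;                  // only here to show the rejected call below

        sortBy(mylist, myless);   // accepted: myless has the required function type
        // sortBy(mylist, f);     // rejected at compile time: f is not callable
    }

The commented-out call is exactly the case that would need #'f-style disambiguation in a Lisp-2; here the type checker does that work.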
Is there anything that two namespaces would make more difficult or more prone to error (or conversely for one namespace), assuming that the language in question is statically typed?
(I'm thinking about this in the context of a domain specific language that I'm working on, and I want to make sure that I don't run into problems down the road. It would be easier to implement with two namespaces (Lisp-2), and since it's statically typed, there's no need for the equivalent of #'f. I asked the question in general because I want to hear general points and perhaps become aware of questions that I don't yet know to ask.)
One very common objection to multiple namespaces is that it complicates the formal semantics to have an arbitrary number of namespaces. One namespace makes things simple. The next simplest, so I'm told (I don't write these things), is an infinite number of namespaces--something I tried to do once and only got halfway through (see here if curious, though I think it's not what you're asking for in this case). It's when you limit it to a finite number of namespaces that a formal semantics gets messy, or so I'm told. And certainly it makes any kind of compiler or interpreter a bit more complex, too. This objection straddles the line between aesthetic and technical in that it's not an objection based on technical difficulty per se, as no super-human intelligence is required to do multiple namespaces, just a lot of extra manual labor, but rather an objection that doing a more complex semantics means more code, more special cases, more chances for error, etc. Personally, I'm not swayed by such arguments; I merely point them out since you ask. I think you'll find such arguments aren't fatal and that it's fine to proceed on what you're doing and see where it leads. I care much more about programmer/end-user experiences than implementor difficulty. But I mention this argument because other people I respect seem to think this is a big deal and I believe it's the correct answer to your question about difficulty and error-prone-ness.
Note, by the way, that when Gabriel and I wrote Technical Issues of Separation in Function Cells and Value Cells, I needed words to help me avoid saying "Scheme-like" and "Lisp-like" because people had unrelated reasons for preferring Scheme that had nothing to do with namespacing. Using the terms "Lisp1" and "Lisp2" allowed me to keep the paper from becoming a Scheme vs. Lisp debate, and to focus readers narrowly on the matter of namespaces. In the end, ANSI Common Lisp ended up with at least 4 namespaces (functions, values, go tags, block tags) or maybe 5 if you count types (which are sort of like a namespace in some ways but not others), but in any case not 2. So strictly it would be a Lisp4 or Lisp5. But either way it still horrifies those who prefer a single namespace, because any arbitrary finite number bigger than one tends to be unacceptable to them.
My personal recommendation is that you should design the language according to your personal sense of what's right. For some languages, one namespace is fine. For others not. Just don't do it because someone tells you that either way has to be the right way--it's really legitimately your choice to make.
Some argue that one namespace is conceptually simpler, but I think that depends on the notion of simplicity. Some say smaller is simpler (notationally or implementationally). I claim that what's conceptually simplest is what's most isomorphic to how your brain thinks about things, and that your brain may not always be thinking "small" for "simple". I'd cite as at least one example that every human language in existence has multiple definitions for the same word, almost all resolved by context (of which your type inferencing might be one example of contextual information), suggesting that wetware is designed to disambiguate such things and that your brain is wasting some of its natural capacity if you don't let it care about such things. Maybe indeed this contextual inferencing increases the complexity of your language implementation, but languages are implemented only once, and are used many times, so there's every reason to optimize the end user experience, not the implementor experience. It's an investment that pays off later.
Insist on thinking about things how you want. Worry more about making your language have a good feel and about whether what you want to do is implementable at all, even if it does require work and extra checking of the code. No language design decision is the right one for every language--languages are ecologies and what matters is that language elements work well with one another, not that language elements match other languages. Indeed, if languages were going to be all the same, it would be pointless to have multiple languages.
I was thinking about the fact that we can prove a program has bugs. We can test it to assess that it is more or less bug resistant.
But is there a way (even theoretically) to prove that a program has no bugs?
For simple programs, such as a "Hello World", I guess we should be able to do it.
But what about larger programs ?
There are nowadays many different formalisms that can be used to prove programs correct (e.g., formalizations in proof assistants, dependently typed programming languages, separation logic, ...). As noted by others, there is no automatic way to prove any given program correct (see the halting problem). However, those mentioned formalisms are often applicable to specific programs. (Such an application can be far from automatic and require a tremendous amount of creativity.)
Another very important point is what we actually mean by proving a program correct or, as you stated, proving "that a program has no bug". Even with formal methods there is typically no way to say that nothing whatsoever can go wrong with a program. The reason is that formal methods usually show that a program conforms to a specification.
You can think of a specification as a logical formula (that states some property about a program) and of the correctness proof as a formal proof that a program satisfies this formula (i.e., enjoys the corresponding property). Due to this setup, everything outside the specification is not even "considered" by the proof. So to really show that a program has no bugs you would first have to write down a logical formula that states when a program does not have bugs.
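As a deliberately informal example, a specification for a sorting routine might be written as a Hoare-style pre/postcondition pair:

    { true }  sort(xs)  { isSorted(result) && isPermutation(result, xs) }

A proof of this statement guarantees that the output is a sorted permutation of the input, and nothing more: properties you did not write down (memory usage, running time, behaviour under concurrent modification, ...) simply lie outside the specification.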
So it would perhaps be more honest to say that formal methods are often able to prove (beyond doubt) that a program does not have certain kinds of bugs (depending on the specification used).
I have good knowledge of C++ but have never delved into the STL. Which parts of the STL should I learn to improve productivity and reduce defects in my work?
Thanks.
I have good knowledge of C++
With all due respect: no, you don't. The standard library, or at least large parts of it (especially the subset known as the "STL"), is a fundamental part of C++. Without knowledge of it you don't know very much about C++ at all.
In fact, much of the modern design of C++ (essentially everything since the 98 version) was guided by design considerations stemming from the standard library, and many of the changes in the language since then have been changes to the standard library. If you take a look at the official C++ language description, a good part of the document is concerned with the library.
Usually the first reaction (at least in my opinion, of course) of people who have not worked with the STL before is to get put off by all the template code. So I would start by studying that subject a little.
If you already know template fundamentals, I would recommend taking a brief look at an STL design document. This is usually the second hurdle for people not yet familiar with it: the STL is not designed under a typical object-oriented paradigm, but under the generic programming paradigm.
With this in mind, a good start could be this introductory article. The terms used throughout the STL components are explained there. Note that it is a relatively old text focused on the SGI implementation (which predates the C++ standard and therefore mentions, for example, hash-based containers that were not actually part of the standard). However, the theory is still valid.
Well, if you already know most of the things I've said up to this point, just jump directly to the topics the others have provided.
You mention improving your productivity and reducing defects. There are general guidelines you can use for this. I assume C++11, and I cover a bit more than the STL (smart pointers); a short sketch follows the list:
Use containers; they will manage memory for you. You get rid of new for C arrays and of later having to delete them, for example.
For dynamic arrays, use std::vector. You also have hashtables in std::unordered_map and balanced trees with std::map. There are more containers, take a look here.
Use std::array instead of plain C arrays whenever you can: they never decay to pointers when passed as arguments to functions, which avoids some very nasty bugs.
Use smart pointers, and forget forever about naked new and its matching delete in code.
This can reduce errors far more than you would expect, especially in the presence of exceptions.
Use std::make_shared when possible. It allocates the object and its control block in one go, and it avoids the subtle leak you can get (before C++17) when a shared_ptr is constructed from a naked new in the middle of a function call whose other arguments may throw.
Use algorithms instead of hand-coded loops. The code will be far more readable and usually more performant.
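To give a feel for how these guidelines look together, here is a small, made-up C++11 sketch (the names are invented for illustration):

    #include <algorithm>
    #include <array>
    #include <memory>
    #include <string>
    #include <unordered_map>
    #include <utility>
    #include <vector>

    struct Widget {
        explicit Widget(std::string n) : name(std::move(n)) {}
        std::string name;
    };

    int main() {
        // Containers manage their own memory: no new[]/delete[].
        std::vector<int> values{3, 1, 2};
        std::unordered_map<std::string, int> counts{{"a", 1}, {"b", 2}};

        // std::array never decays to a pointer and knows its own size.
        std::array<int, 3> fixed = {{10, 20, 30}};

        // A smart pointer instead of naked new/delete; the Widget is freed
        // automatically, even if an exception is thrown later.
        auto w = std::make_shared<Widget>("example");

        // An algorithm instead of a hand-written loop.
        std::sort(values.begin(), values.end());

        // Use the variables so the example compiles cleanly.
        return values.front() + fixed.front() + counts["a"] + static_cast<int>(w->name.size());
    }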
With this advice your code should look closer (but not necessarily equal or semantically equivalent) to C# or Java, in which manual memory management disappears. This is even better than garbage collection in many scenarios, since you have deterministic guarantees for when a resource will be freed.
I'd say the algorithms from <algorithm> will really clean up your code and at the same time make your code more concise.
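For instance, here is a made-up before/after of a hand-rolled loop versus the <algorithm> version:

    #include <algorithm>
    #include <vector>

    // Hand-coded loop: the reader has to reverse-engineer the intent.
    bool containsNegativeLoop(const std::vector<int>& v) {
        for (std::vector<int>::size_type i = 0; i < v.size(); ++i) {
            if (v[i] < 0) {
                return true;
            }
        }
        return false;
    }

    // <algorithm> version: the intent is in the name.
    bool containsNegativeAlgo(const std::vector<int>& v) {
        return std::any_of(v.begin(), v.end(), [](int x) { return x < 0; });
    }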
Obviously, knowledge of all the containers will help you optimize the bottlenecks in your code caused by a choice of container that is not optimal (but be sure to profile first).
These are pretty much the basics and they will help you a lot to make more robust code.
After that you can delve into smart pointers like std::shared_ptr which are almost always better than regular pointers (in my case, at least).
I think you can start with containers (vector, list) and algorithms (binary search, sort).
And as Jesse Emond wrote, the more you know, the better you live)))
As we program, we all develop practices and patterns that we use and rely on. However, over time, as our understanding, maturity, and even technology usage changes, we come to realize that some practices that we once thought were great are not (or no longer apply).
An example of a practice I once used quite often, but have in recent years changed, is the use of the Singleton object pattern.
Through my own experience and long debates with colleagues, I've come to realize that singletons are not always desirable - they can make testing more difficult (by inhibiting techniques like mocking) and can create undesirable coupling between parts of a system. Instead, I now use object factories (typically with an IoC container) that hide the nature and existence of singletons from the parts of the system that don't care - or don't need to know. Those parts instead rely on a factory (or service locator) to acquire access to such objects.
My questions to the community, in the spirit of self-improvement, are:
What programming patterns or practices have you reconsidered recently, and now try to avoid?
What did you decide to replace them with?
//Coming out of university, we were taught to ensure we always had an abundance
//of commenting around our code. But applying that to the real world, made it
//clear that over-commenting not only has the potential to confuse/complicate
//things but can make the code hard to follow. Now I spend more time on
//improving the simplicity and readability of the code and inserting fewer yet
//relevant comments, instead of spending that time writing overly-descriptive
//commentaries all throughout the code.
Single return points.
I once preferred a single return point for each method, because with that I could ensure that any cleanup needed by the routine was not overlooked.
Since then, I've moved to much smaller routines - so the likelihood of overlooking cleanup is reduced and in fact the need for cleanup is reduced - and find that early returns reduce the apparent complexity (the nesting level) of the code. Artifacts of the single return point - keeping "result" variables around, keeping flag variables, conditional clauses for not-already-done situations - make the code appear much more complex than it actually is, make it harder to read and maintain. Early exits, and smaller methods, are the way to go.
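A small, made-up illustration of the difference:

    #include <string>

    // Single exit point: a "result" variable and extra nesting for a simple check.
    std::string greetingSingleReturn(const std::string* name) {
        std::string result;
        if (name != nullptr && !name->empty()) {
            result = "Hello, " + *name;
        } else {
            result = "Hello, stranger";
        }
        return result;
    }

    // Early exit: the special case is handled and dismissed immediately.
    std::string greetingEarlyReturn(const std::string* name) {
        if (name == nullptr || name->empty()) {
            return "Hello, stranger";
        }
        return "Hello, " + *name;
    }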
Trying to code things perfectly on the first try.
Trying to create perfect OO model before coding.
Designing everything for flexibility and future improvements.
In one word: overengineering.
Hungarian notation (both Forms and Systems).
I used to prefix everything. strSomeString or txtFoo.
Now I use someString and textBoxFoo. It's far more readable and easier for someone new to come along and pick up. As an added bonus, it's trivial to keep it consistent -- camelCase the control and append a useful/descriptive name. Forms Hungarian has the drawback of not always being consistent, and Systems Hungarian doesn't really gain you much. Chunking all your variables together isn't really that useful -- especially with modern IDEs.
The "perfect" architecture
I came up with THE architecture a couple of years ago. Pushed myself technically as far as I could so there were 100% loosely coupled layers, extensive use of delegates, and lightweight objects. It was technical heaven.
And it was crap. The technical purity of the architecture just slowed my dev team down aiming for perfection over results and I almost achieved complete failure.
We now have much simpler less technically perfect architecture and our delivery rate has skyrocketed.
The use of caffeine. It once kept me awake and in a glorious programming mood, where the code flew from my fingers with feverish fluidity. Now it does nothing, and if I don't have it I get a headache.
Commenting out code. I used to think that code was precious and that you can't just delete those beautiful gems that you crafted. I now delete any commented-out code I come across unless there's a TODO or NOTE attached, because it's too perilous to leave it in. Case in point: I've come across old classes with huge commented-out portions, and it really confused me why they were there: were they recently commented out? Is this a dev-environment change? Why does it contain this unrelated block?
Seriously consider not commenting out code and just deleting it instead. If you need it, it's still in source control. YAGNI though.
The overuse / abuse of #region directives. It's just a little thing, but in C#, I previously would use #region directives all over the place, to organize my classes. For example, I'd group all class properties together in a region.
Now I look back at old code and mostly just get annoyed by them. I don't think it really makes things clearer most of the time, and sometimes they just plain slow you down.
So I have now changed my mind and feel that well laid out classes are mostly cleaner without region directives.
Waterfall development in general, and in specific, the practice of writing complete and comprehensive functional and design specifications that are somehow expected to be canonical and then expecting an implementation of those to be correct and acceptable. I've seen it replaced with Scrum, and good riddance to it, I say. The simple fact is that the changing nature of customer needs and desires makes any fixed specification effectively useless; the only way to really properly approach the problem is with an iterative approach. Not that Scrum is a silver bullet, of course; I've seen it misused and abused many, many times. But it beats waterfall.
Never crashing.
It seems like such a good idea, doesn't it? Users don't like programs that crash, so let's write programs that don't crash, and users should like the program, right? That's how I started out.
Nowadays, I'm more inclined to think that if it doesn't work, it shouldn't pretend it's working. Fail as soon as you can, with a good error message. If you don't, your program is going to crash even harder just a few instructions later, but with some nondescript null-pointer error that'll take you an hour to debug.
My favorite "don't crash" pattern is this:
public User readUserFromDb(int id) {
    User u = null;
    try {
        Statement stmt = connection.createStatement();
        ResultSet rs = stmt.executeQuery("SELECT * FROM user WHERE id = " + id);
        if (rs.next()) {
            u = new User();
            u.setFirstName(rs.getString("fname"));
            u.setSurname(rs.getString("sname"));
            // etc
        }
    } catch (Exception e) {
        log.info(e);
    }
    if (u == null) {
        u = new User();
        u.setFirstName("error communicating with database");
        u.setSurname("error communicating with database");
        // etc
    }
    u.setId(id);
    return u;
}
Now, instead of asking your users to copy/paste the error message and sending it to you, you'll have to dive into the logs trying to find the log entry. (And since they entered an invalid user ID, there'll be no log entry.)
I thought it made sense to apply design patterns whenever I recognised them.
Little did I know that I was actually copying styles from foreign programming languages, while the language I was working with allowed for far more elegant or easier solutions.
Using multiple (very) different languages opened my eyes and made me realise that I don't have to mis-apply other people's solutions to problems that aren't mine. Now I shudder when I see the factory pattern applied in a language like Ruby.
Obsessive testing. I used to be a rabid proponent of test-first development. For some projects it makes a lot of sense, but I've come to realize that it is not only unfeasible, but rather detrimental to many projects to slavishly adhere to a doctrine of writing unit tests for every single piece of functionality.
Really, slavishly adhering to anything can be detrimental.
This is a small thing, but: Caring about where the braces go (on the same line or next line?), suggested maximum line lengths of code, naming conventions for variables, and other elements of style. I've found that everyone seems to care more about this than I do, so I just go with the flow of whoever I'm working with nowadays.
Edit: The exception to this being, of course, when I'm the one who cares the most (or is the one in a position to set the style for a group). In that case, I do what I want!
(Note that this is not the same as having no consistent style. I think a consistent style in a codebase is very important for readability.)
Perhaps the most important "programming practice" I have since changed my mind about, is the idea that my code is better than everyone else's. This is common for programmers (especially newbies).
Utility libraries. I used to carry around an assembly with a variety of helper methods and classes with the theory that I could use them somewhere else someday.
In reality, I just created a huge namespace with a lot of poorly organized bits of functionality.
Now, I just leave them in the project I created them in. In all probability I'm not going to need them again, and if I do, I can always refactor them into something reusable later. Sometimes I will flag them with a //TODO for possible extraction into a common assembly.
Designing more than I coded.
After a while, it turns into analysis paralysis.
The use of a DataSet to perform business logic. This binds the code too tightly to the database; the DataSet is also usually created from SQL, which makes things even more fragile. If the SQL or the database changes, the breakage tends to trickle through to everything the DataSet touches.
Performing any business logic inside an object constructor. Combined with inheritance and the ability to create overloaded constructors, this tends to make maintenance difficult.
Abbreviating variable/method/table/... Names
I used to do this all of the time, even when working in languages with no enforced limits on the lengths of names (well, they were probably 255 or something). One of the side effects was a lot of comments littered throughout the code explaining the (non-standard) abbreviations. And of course, if the names were changed for any reason...
Now I much prefer to call things what they really are, with good descriptive names, using only standard abbreviations. There's no need for useless comments, and the code is far more readable and understandable.
Wrapping existing Data Access components, like the Enterprise Library, with a custom layer of helper methods.
It doesn't make anybody's life easier
It's more code that can have bugs in it
A lot of people know how to use the EntLib data access components. No one but the local team knows how to use the in house data access solution
I first heard about object-oriented programming while reading about Smalltalk in 1984, but I didn't have access to an o-o language until I used the cfront C++ compiler in 1992. I finally got to use Smalltalk in 1995. I had eagerly anticipated o-o technology, and bought into the idea that it would save software development.
Now, I just see o-o as one technique that has some advantages, but it's just one tool in the toolbox. I do most of my work in Python, and I often write standalone functions that are not class members, and I often collect groups of data in tuples or lists where in the past I would have created a class. I still create classes when the data structure is complicated, or I need behavior associated with the data, but I tend to resist it.
I'm actually interested in doing some work in Clojure when I get the time, which doesn't provide o-o facilities, although it can use Java objects if I understand correctly. I'm not ready to say anything like o-o is dead, but personally I'm not the fan I used to be.
In C#, using _notation for private members. I now think it's ugly.
I then changed to this.notation for private members, but found I was inconsistent in using it, so I dropped that too.
I stopped going by the university recommended method of design before implementation. Working in a chaotic and complex system has forced me to change attitude.
Of course I still do code research, especially when I'm about to touch code I've never touched before, but normally I try to focus on implementations that are as small as possible, just to get something going first. This is the primary goal. Then I gradually refine the logic and let the design just appear by itself. Programming is an iterative process and works very well with an agile approach and with lots of refactoring.
The code will not look at all what you first thought it would look like. Happens every time :)
I used to be big into design-by-contract. This meant putting a lot of error checking at the beginning of all my functions. Contracts are still important, from the perspective of separation of concerns, but rather than try to enforce what my code shouldn't do, I try to use unit tests to verify what it does do.
I would use statics in a lot of methods/classes, as it was more concise. When I started writing tests, that practice changed very quickly.
Checked Exceptions
An amazing idea on paper - defines the contract clearly, no room for mistake or forgetting to check for some exception condition. I was sold when I first heard about it.
Of course, it turned out to be such a mess in practice. To the point of having libraries today like Spring JDBC, which lists hiding legacy checked exceptions as one of its main features.
That anything worthwhile was only coded in one particular language. In my case I believed that C was the best language ever and I never had any reason to code anything in any other language... ever.
I have since come to appreciate many different languages and the benefits/functionality they offer. If I want to code something small - quickly - I would use Python. If I want to work on a large project I would code in C++ or C#. If I want to develop a brain tumour I would code in Perl.
When I needed to do some refactoring, I thought it was faster and cleaner to start straightaway and implement the new design, fixing up the connections until they work. Then I realized it's better to do a series of small refactorings to slowly but reliably progress towards the new design.
Perhaps the biggest thing that has changed in my coding practices, as well as in others', is the acceptance of outside classes and libraries downloaded from the internet as the basis for behaviors and functionality in applications. When I attended college, we were encouraged to figure out how to make things better via our own code and to rely upon the language to solve our problems. With the advances in all aspects of user interfaces and service/data consumption, this is no longer a realistic notion.
There are certain things which will never change in a language, and having a library that wraps this code in a simpler transaction and in fewer lines of code that I have to write is a blessing. Connecting to a database will always be the same. Selecting an element within the DOM will not change. Sending an email via a server-side script will never change. Having to write this time and again wastes time that I could be using to improve my core logic in the application.
Initializing all class members.
I used to explicitly initialize every class member with something, usually NULL. I have come to realize that this:
normally means that every variable is initialized twice before ever being read (sketched below)
is silly because most languages automatically initialize variables to NULL anyway
actually enforces a slight performance hit in most languages
can bloat code on larger projects
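For example, here is what the double initialization looks like in C++ (a sketch; the details differ by language). A class-type member is default-constructed before the constructor body runs, so assigning to it in the body initializes it twice:

    #include <string>

    class UserTwiceInitialized {
    public:
        UserTwiceInitialized() {
            name = "";  // 'name' was already default-constructed to "" before
                        // this body ran, so this is a second initialization.
        }
    private:
        std::string name;
    };

    class UserInitializedOnce {
    public:
        UserInitializedOnce() : id(0) {}  // built-in members are the exception:
                                          // C++ does NOT auto-initialize them.
    private:
        std::string name;  // default-constructed automatically; nothing to do
        int id;
    };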
Like you, I also have embraced IoC patterns in reducing coupling between various components of my apps. It makes maintenance and parts-swapping much simpler, as long as I can keep each component as independent as possible. I'm also utilizing more object-relational frameworks such as NHibernate to simplify database management chores.
In a nutshell, I'm using "mini" frameworks to aid in building software more quickly and efficiently. These mini-frameworks save lots of time, and if done right can make an application super simple to maintain down the road. Plug 'n Play for the win!
I have heard many developers refer to code as "legacy". Most of the time it is code that has been written by someone who no longer works on the project. What is it that makes code, legacy code?
Update, in response to this comment from S.Lott:
"Something handed down from an ancestor or a predecessor or from the past" (http://www.thefreedictionary.com/legacy). Clearly you wanted to know something else. Could you clarify or expand your question?
I am looking for the symptoms of legacy code that make it unusable or a nightmare to work with. When is it better to throw it away? It is my opinion that code should be thrown away more often and that reinventing the wheel is a valuable part of development. The academic ideal of not reinventing the wheel is a nice one, but it is not very practical.
On the other hand there is obviously legacy code worth keeping.
By using hardware, software, APIs, languages, technologies or features that are either no longer supported or have been superseded, typically combined with little to no possibility of ever replacing that code, instead using it until it or the system dies.
What is it that makes code, legacy code?
As with a plain legacy: when the author is dead or missing, you as an heir get all or some of his code.
You shed some tears and try to figure out what to do with all this rubbish.
Michael Feathers has an interesting definition in his book Working Effectively with Legacy Code. According to him legacy code is code without automated tests.
It is a very general (and oft-abused) term, but any of the following would be legitimate reasons to call an app legacy:
1. The code base is based on a language/platform which is entirely unsupported by the manufacturer of the original product (often said manufacturer has gone out of business).
2. (really 1a) The code base or platform on which it is built is so old that getting qualified or experienced developers for the system is both hard and expensive.
3. The application supports some aspect of the business which is no longer actively grown and for which alterations are extremely rare, normally to fix it if something entirely unexpected changes around it (the canonical example being the Y2K issue) or if some regulation/external pressure forces it. Since both reasons are pressing and normally unavoidable, but no significant development has occurred on the project, it is likely that the people assigned to deal with this will be unfamiliar with the system (and its accumulated behaviours and intricacies). In these cases this would often be reason to increase the perceived and planned-for risk associated with the project.
4. The system has been, or is being, replaced with another. As such the system may be used for much less than originally intended, or perhaps only as a means of viewing historical data.
Legacy generally refers to code that is no longer being developed - meaning that if you use it, you have to use it on its original terms - you cannot just edit it to support the way the world looks today. For example, legacy code has to run on hardware that may not exist today - or is no longer supported.
According to Michael Feathers, the author of the excellent Working Effectively with Legacy Code, legacy code is code that has no tests - code where there is no way to know what breaks when it changes.
"The main thing that distinguishes legacy code from non-legacy code is tests, or rather a lack of tests. We can get a sense of this with a little thought experiment: how easy would it be to modify your code base if it could bite back, if it could tell you when you made a mistake? It would be pretty easy, wouldn't it? Most of the fear involved in making changes to large code bases is fear of introducing subtle bugs; fear of changing things inadvertently. With tests, you can make things better with impunity. To me, the difference is so critical, it overwhelms any other distinction. With tests, you can make things better. Without them, you just don't know whether things are getting better or worse."
Nobody is going to read this, but I feel the other answers don't quite get it right. Legacy code:
Has value - if it weren't useful, it would have been thrown away long ago.
Is hard to reason about, because of one or more of the following: lack of documentation; the original author cannot be found or has forgotten it (yes, two months later your code can be legacy code too!); lack of tests or a type system; not following modern practices (i.e., no context to hold on to).
Comes with a requirement to change or extend it. If there isn't a requirement to change it, it isn't legacy code, since nobody cares about it. It does its thing and there is nobody around to call it legacy code.
A colleague once told me that legacy code was any code that you hadn't written yourself.
Arguably, it's just a pejorative term for code that we don't like any more for whatever reason (typically because it's not cool or fashionable but it works).
The TDD brigade might suggest that any code without tests is legacy code.
Legacy code is source code that relates to a no-longer supported or manufactured operating system or other computer technology.
http://en.wikipedia.org/wiki/Legacy_code
"Legacy code is source code that relates to a no-longer supported or manufactured "
Any code with support (or documentation) missing. Be it:
inline comments
technical documentation
spoken documentation (the person who wrote it)
unit tests documenting the workings of the code
For me legacy code is code that was written prior to some paradigm shift.
It may still be very much in use but it is in the process of being refactored to bring it into line.
e.g. Old procedural code hanging around in an otherwise OO system.
Code (or anything else, really) becomes "legacy" when it has been replaced by something newer/better, and yet despite this it's still used and kept alive "in the wild".
Preserving legacy code is not so much an academic ideal as it is keeping code that works, no matter how poorly. In many conservative enterprise situations, that would be considered more practical than throwing it away and starting again from scratch. Better the devil you know...
Legacy code is code that is painful/expensive to keep current with changing requirements.
There are two ways that this can happen:
1) The code is unsuitable for change
2) The semantics of the code have been swapped out to silicon
1) is the easier of the two to recognize. It is software that has fundamental limits making it unable to keep up with the ecosystem around it. For example, a system built around an O(n^2) algorithm won't scale beyond a certain point and must be re-written if requirements move in that direction. Another example is code using libraries that are not supported on the latest OS versions.
2) Is harder to recognize, but all code of this kind shares the characteristic that people are afraid to change it. This could be because it was badly written/documented to begin with, because it is untested, or because it is non-trivial and the original authors who understood it left the team.
The ASCII/Unicode chars that comprise living code have semantic meaning, the "why's", "what's" and to some degree the "how's", in the minds of people associated with it. Legacy code is either un-owned or the owners do not have meaning associated with large portions of it. Once this happens (and it could happen the next day with really poorly-written code), to change this code, someone must learn it and understand it. This process is a significant fraction of the time it takes to write it in the first place.
The day you're afraid to refactor your code is the day when your code has become legacy.
I consider code "legacy" if any or all of the following conditions apply:
It was written using a language or methodology that is a generation behind current standards
The code is a complete mess with no planning or design behind it
It is written in outdated languages and in an outdated, non object-oriented style
It is difficult to find developers who know the language because it is so old
Unlike some of the other opinions here, I've seen plenty of modern applications that work decently without unit tests. Unit testing still has not caught on with everyone. Perhaps ten years from now the next generation of programmers will look at our current applications and consider them "legacy" for not containing unit tests, just as I consider non object-oriented applications to be legacy.
If few changes need to be made to a legacy codebase, it's better to simply leave it as-is and go with the flow. If the application needs drastic functionality changes, a GUI overhaul, and/or you can't find anyone who knows the programming language, it's time to throw away and start over. A word of warning, however: rewriting from scratch can be very time-consuming, and it's difficult to know if you've replicated all functionality. You'll probably want to have test cases and unit tests written for the legacy application and the new application.
Quite honestly, legacy code is any code, framework, API, or other software construct that's not "cool" anymore. For example, COBOL is unanimously regarded as legacy while APL is not. Now, one can also make the case that COBOL is considered legacy and APL is not because COBOL has about a million times the install base of APL. However, if you say that you need to work on APL code, the reply would not be "oh no, that legacy stuff" but rather "oh my god, guess you won't be doing anything for the next century". See the difference?
This is a general term thrown around quite often (and quite generically) in the software ecosystem.
Well, I like to think of legacy code as inherited code. This is simply code that was written in the past. In most cases, legacy code does not follow new/current practices and is often considered archaic.
Legacy code is anything written more than a month ago :-)
It's often any code that isn't written in the trendy scripting language du jour, and I'm only half joking.