Sorry if this is a stupid question, but sometimes I see Easter eggs and stuff in programs like Aptitude (the package manager for Debian).
Is it possible, or even likely, that more sinister features make their way into open-source software?
It's certainly possible, but it's more complicated. I don't know of any actual malware going around, but people have made mistakes with similar effects. (I know of mistakes that have been found; obviously, I don't know how many haven't been.)
If you put malware into closed-source software, the only way to find it is to detect the effects and analyze the binary. There are people who are very good at analyzing the binary.
In open-source software, anybody can look at the source code. Not many will, for most packages, but there's a much higher chance of being found out. Once found out, anybody can patch the software to do the good things without the bad. Moreover, most open source software has publicly available repositories, which means that anybody can track down the history of the code, and (at least to a pseudonym) who did what. There is also a tendency to produce more readable code in open source, so that changes will stand out more.
The caveat, of course, is that most of us really don't know what to look for in software security. If I run a compression program, and it compresses my file to a shorter version that looks like gibberish, and I can get the original back, I know that's working. If it changes it to gibberish that it claims is encrypted, I don't know a priori how to tell if it's well encrypted.
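That compression check is easy to automate, by the way; a minimal sketch using Python's standard zlib module (the data here is a made-up example):

```python
import zlib

original = b"some data worth keeping" * 100
compressed = zlib.compress(original)

# The compressed form is shorter and looks like gibberish...
assert len(compressed) < len(original)
# ...but the real test is that the round trip restores the original exactly.
assert zlib.decompress(compressed) == original
print("compression round trip OK")
```

No comparable self-check exists for encryption: output that looks like gibberish tells you nothing about whether the cipher or the key is any good.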
It's possible, but somewhat harder, because the source code is there. The author would be counting on no one bothering to read the source code before running it, which is true for a lot of people, I suppose. I know I don't bother to read the source code of the open-source programs I run. In a larger project it's harder, because the code is often reviewed; if there's just one author, it becomes a lot easier.
Any software can contain malicious parts (intentionally or unintentionally). The advantage of open source is that you can check it (if you like and have the time to do so).
Yes it's possible; see the Debian OpenSSL debacle for a nice example:
http://www.metasploit.com/users/hdm/tools/debian-openssl/
Although this is not a virus/spyware/malware, it clearly shows what could go wrong in open source software.
I can't believe nobody's mentioned Ken Thompson's compiler virus yet.
Having access to the source code offers a reasonable level of assurance that the program won't behave maliciously. However, unless you've inspected one of:
The binary output of the compiler
The binary code of the compiler itself
The source code of the compiler (and its compiler, and its compiler, etc.) and the binary code of the compiler used to compile it.
you could end up with malicious code in the compiled binary that never appears in the source. Admittedly it's a very unlikely and extremely difficult form of attack, but it's theoretically possible to introduce malicious code into an open source project in a way that cannot be detected in the source code (of either the project or the compiler).
If you're not working for the CIA (or the equivalent agency of another government), compiler security probably isn't something you have to worry about. But it is a very cool concept to think about.
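For the curious, here is a toy sketch of the idea in Python; everything in it is made up for illustration (Thompson's real demonstration lived inside the C compiler), but it shows the two-pronged structure of the attack:

```python
BACKDOOR = '\nif user == "ken": grant_access()  # never visible in any source\n'

def honest_compile(source: str) -> str:
    """Stand-in for real compilation: the 'binary' here is just the text."""
    return source

def compile_source(source: str) -> str:
    if "def check_login" in source:
        # Compiling the login program? Slip a backdoor into its binary.
        return honest_compile(source + BACKDOOR)
    if "def compile_source" in source:
        # Compiling the compiler itself? Re-insert this injection logic, so
        # rebuilding the compiler from clean source yields a poisoned binary.
        return honest_compile(source + "\n# ...injection logic re-inserted...\n")
    return honest_compile(source)

# The login program's source never contains the backdoor, yet its "binary" does:
print(compile_source("def check_login(user, password): ..."))
```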
I'd say that it obviously is possible. All it requires is that code gets accepted without sufficient review. It's not hard to come up with scenarios permitting that, since reviewers are human.
The more interesting question then becomes how likely it is that malware gets accepted into some package where it can do harm. This is far harder to answer, unfortunately. We seem to be doing okay so far (knocks on wood), at least.
I guess Linus's law ("with enough eyes, all bugs are shallow") holds true, but it is easy to assume that just because something is open source, people will spend a lot of time eyeballing its code. That is not generally true, as far as I know.
EDIT: Changed the wording about Linus's law above, had it wrongly attributed.
Is it possible in principle? Of course. Any software can do things people don't want.
Is it possible in practice? The argument against is of course that the software is available, and that many eyes are looking at it, so it'll be discovered before it can do too much damage.
On the other hand, there is the Underhanded C Code Contest, http://www.underhanded-c.org/, in which you submit programs that intentionally misbehave, but where the cause of the bad behavior is not apparent from an inspection of the source.
Then there's of course the Debian SSL bug, where SSL keys generated with the OpenSSL library on Debian were quite insecure. This, apparently, was just an act of incompetence (Hanlon's razor, everybody), but it shows how security problems can sneak into open-source code. With weak keys and SSH access, you don't need a virus in the code; you just exploit the weak code when it's running on production systems.
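To make that bug concrete: the offending patch left the process ID as essentially the only entropy source, and PIDs topped out at 32768, so every possible key could simply be enumerated. A hypothetical sketch (Python's random stands in for OpenSSL's PRNG; real key generation is far more involved):

```python
import random

def weak_keypair(pid: int) -> int:
    rng = random.Random(pid)       # seeded by the process ID alone
    return rng.getrandbits(1024)   # stand-in for real RSA key generation

# An attacker can precompute every key any affected machine could generate:
all_possible_keys = {weak_keypair(pid) for pid in range(1, 32768)}
print(f"{len(all_possible_keys)} keys cover the entire affected population")
```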
Take that as a yes/no/maybe :)
Yes, it's possible, just as it's possible for the same thing to happen with closed-source software (a malicious developer on the team, etc.).
It's arguably less likely with open source, though: the moment anything like that is noticed, any other user can pull the problem code, and it's no longer a problem.
Just as closed-source software can be a virus/spyware/malware, open-source software can be as well. There's also plenty of horrible open-source software.
So far, though, every piece of malware I've seen has been closed-source Windows software, so my bias is that closed-source apps have a higher chance of being crap overall.
Who prevents it? Even if software is open source, only a poorly run project lets just anyone touch the release repository without authorization. Usually there are maintainers who review all incoming patches.
99% of all software produced and used is poor quality and bug ridden.
It is, but usually it's noticed and removed before it becomes an issue. With any well-maintained open-source software, there are many people who check each revision for any changes that were made.
Yes it is possible, it depends how carefully controlled commit access to the source code is and how carefully monitored those commits are. Some projects have a few lead developers who request patches from the community and commit this to the code base, other projects will grant access to many more developers. Equally some projects have a large number of people reviewing the source code as changes are made.
It is possible, indeed, as it is code like any proprietary software. However, the main difference is that you, and the community, have access to the code, and this fact is enough to stop it from happening in almost all cases. Also, the vast number of versions of libraries and kernels makes malware less likely to succeed.
Do you really know what you use? Do you check? Does the typical user check or care?
For example, google keywords like repository compromised or gpg repository compromised, or something along those lines.
It is possible, but not very likely.
There's nothing special about open source code that makes it magically resistant to containing bad things, but open source which is actively developed by a group of people is very unlikely to contain malicious code, because someone would notice and blow the whistle.
In addition, in most open source projects it's possible to trace the history of any particular piece of code, by looking through the project's source repository, which means that the author of a malicious piece of code can be identified.
If in doubt, you can always review the code yourself, or hire someone to review it for you. Code review generally won't catch subtle bugs or errors, of course, but malicious code is likely to be more obvious.
Another example where open-source security proved superior to that of closed source is the Interbase backdoor.
From The Register:
A back door password has been hidden in Borland/Inprise's popular Interbase database software for at least seven years, potentially exposing tens of thousands of private databases at corporations and government agencies to unauthorized access and manipulation over the Internet, experts say.
The password was discovered when Interbase was made open source.
Which doesn't mean that security of open-source software is perfect, or even good. Who needs to insert malware into original software when there are thousands of remotely exploitable security holes all over the place?
Is it likely that more sinister features make their way into open-source software?
Officially no, it's relatively unlikely that malware features will get in and go unnoticed for long. But:
servers holding distribution sources can be (and have been) compromised so that what you're downloading doesn't correspond to the open-source development work;
in the case of binary distributions (usually for Windows), the installer for the software can be packaged with malware. Again, officially this happens quite rarely; one example is early versions of LimeWire, which installed a ‘shopping helper’ affiliate-fee-stealer BHO to “support the project”, and lost a lot of goodwill doing so;
but, there are also some scam artists who squat search results for well-known open source projects (again, most commonly with file-sharing software) and deliver their own tweaked installers bundled with malware. Always find the project's official site before downloading.
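One partial mitigation for all three cases is to verify the download against a checksum published on the project's official site before running it. A minimal sketch; the file name and expected digest are made-up placeholders:

```python
import hashlib

EXPECTED_SHA256 = "0123abcd..."  # copied from the project's download page

def sha256_of(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

if sha256_of("installer.exe") != EXPECTED_SHA256:
    raise SystemExit("hash mismatch: do not run this installer")
```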
Just a reminder,
SSH Backdoored before
WordPress backdoored before
Both of them happened as the result of an attack, so it wasn't planted by the original developers. I think this kind of proves that it can happen, but it's not that likely, and it will get picked up quite soon if the application is popular enough.
I have heard many developers refer to code as "legacy". Most of the time it is code that has been written by someone who no longer works on the project. What is it that makes code, legacy code?
Update in response to:
"Something handed down from an ancestor or a predecessor or from the past" http://www.thefreedictionary.com/legacy. Clearly you wanted to know something else. Could you clarify or expand your question? S.Lott
I am looking for the symptoms of legacy code that make it unusable or a nightmare to work with. When is it better to throw it away? It is my opinion that code should be thrown away more often, and that reinventing the wheel is a valuable part of development. The academic ideal of not reinventing the wheel is a nice one, but it is not very practical.
On the other hand there is obviously legacy code worth keeping.
By using hardware, software, APIs, languages, technologies, or features that are either no longer supported or have been superseded, typically combined with little to no possibility of ever replacing that code, instead using it until it or the system dies.
What is it that makes code, legacy code?
As with a plain legacy, when the author is dead or missing, you as an heir get all or some of his code.
You shed some tears and try to figure out what to do with all this rubbish.
Michael Feathers has an interesting definition in his book Working Effectively with Legacy Code. According to him legacy code is code without automated tests.
It is a very general (and oft-abused) term, but any of the following would be legitimate reasons to call an app legacy:
The code base is based on a language/platform which is entirely unsupported by the manufacturer of the original product (often said manufacturer has gone out of business).
(really 1a) The code base or platform on which it is built is so old that getting qualified or experienced developers for the system is both hard and expensive.
The application supports some aspect of the business which is no longer actively grown, and for which alterations are extremely rare, normally to fix it if something entirely unexpected changes around it (the canonical example being the Y2K issue) or if some regulation/external pressure forces it. Since both reasons are pressing and normally unavoidable, but no significant development has occurred on the project, it is likely that the people assigned to deal with this will be unfamiliar with the system (and its accumulated behaviours and intricacies). In these cases this would often be reason to increase the perceived, and planned-for, risk associated with the project.
The system has been, or is being, replaced with another. As such, the system may be used for much less than originally intended, or perhaps only as a means of viewing historical data.
Legacy generally refers to code that is no longer being developed - meaning that if you use it, you have to use it on its original terms - you cannot just edit it to support the way the world looks today. For example, legacy code has to run on hardware that may not exist today - or is no longer supported.
According to Michael Feathers, the author of the excellent Working Effectively with Legacy Code, legacy code is code which has no tests: there is no way to know what breaks when the code changes.
The main thing that distinguishes legacy code from non-legacy code is tests, or rather a lack of tests. We can get a sense of this with a little thought experiment: how easy would it be to modify your code base if it could bite back, if it could tell you when you made a mistake? It would be pretty easy, wouldn't it? Most of the fear involved in making changes to large code bases is fear of introducing subtle bugs; fear of changing things inadvertently. With tests, you can make things better with impunity. To me, the difference is so critical, it overwhelms any other distinction. With tests, you can make things better. Without them, you just don't know whether things are getting better or worse.
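A minimal sketch of the kind of test Feathers has in mind, using Python's built-in unittest; the function under test is a made-up stand-in for real legacy code. A "characterization test" like this pins down current behavior so the code can be changed without fear:

```python
import unittest

def shipping_cost(weight_kg):
    # The made-up "legacy" function whose current behavior we want to pin down.
    return 5.0 if weight_kg <= 1 else 5.0 + 2.5 * (weight_kg - 1)

class TestShippingCost(unittest.TestCase):
    def test_base_rate(self):
        self.assertEqual(shipping_cost(1), 5.0)

    def test_overweight_surcharge(self):
        self.assertEqual(shipping_cost(3), 10.0)

if __name__ == "__main__":
    unittest.main()
```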
Nobody is gonna read this, but I feel the other answers don't get it quite right:
It has value; if it weren't useful, it would've been thrown away long ago
It's hard to reason about, because of any of:
lack of documentation,
the original author cannot be found or has forgotten it (yes, two months later your code can be legacy code too!!),
lack of tests or a type system,
it doesn't follow modern practices (i.e., no context to hold on to)
There is a requirement to change or extend it.
If there isn't a requirement to change it, it isn't legacy code, since nobody cares about it. It does its thing and there is nobody around to call it legacy code.
A colleague once told me that legacy code was any code that you hadn't written yourself.
Arguably, it's just a pejorative term for code that we don't like any more for whatever reason (typically because it's not cool or fashionable but it works).
The TDD brigade might suggest that any code without tests is legacy code.
Legacy code is source code that relates to a no-longer supported or manufactured operating system or other computer technology.
http://en.wikipedia.org/wiki/Legacy_code
"Legacy code is source code that relates to a no-longer supported or manufactured "
Any code with support (or documentation) missing. Be it:
inline comments
technical documentation
spoken documentation (the person who wrote it)
unit tests documenting the workings of the code
For me legacy code is code that was written prior to some paradigm shift.
It may still be very much in use but it is in the process of being refactored to bring it into line.
e.g. Old procedural code hanging around in an otherwise OO system.
Code (or anything else, really) becomes "legacy" when it has been replaced by something newer/better, and yet despite this it's still used and kept alive "in the wild".
Preserving legacy code is not so much an academic ideal as it is keeping code that works, no matter how poorly. In many conservative enterprise situations, that would be considered more practical than throwing it away and starting again from scratch. Better the devil you know...
Legacy code is code that is painful/expensive to keep current with changing requirements.
There are two ways that this can happen:
The code is unsuitable for change
The semantics of the code have been swapped out to silicon
1) is the easier of the two to recognize. It is software that has fundamental limits making it unable to keep up with the ecosystem around it. For example, a system built around an O(n^2) algorithm won't scale beyond a certain point and must be re-written if requirements move in that direction. Another example is code using libraries that are not supported on the latest OS versions.
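To make that first case concrete, a toy illustration with hypothetical functions: the same question answered by an O(n^2) algorithm and by an O(n) one.

```python
def has_duplicates_quadratic(items):
    # Compares every pair: fine for hundreds of items, hopeless for millions.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicates_linear(items):
    # Tracking already-seen items brings this to O(n) for hashable inputs.
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

When requirements grow from thousands of items to millions, the first version hits a wall that no amount of maintenance fixes short of a rewrite.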
2) is harder to recognize, but all code of this kind shares the characteristic that people are afraid to change it. This could be because it was badly written/documented to begin with, because it is untested, or because it is non-trivial and the original authors who understood it left the team.
The ASCII/Unicode chars that comprise living code have semantic meaning, the "why's", "what's" and to some degree the "how's", in the minds of people associated with it. Legacy code is either un-owned or the owners do not have meaning associated with large portions of it. Once this happens (and it could happen the next day with really poorly-written code), to change this code, someone must learn it and understand it. This process is a significant fraction of the time it takes to write it in the first place.
The day you're afraid to refactor your code is the day when your code has become legacy.
I consider code "legacy" if any or all of the following conditions apply:
It was written using a language or methodology that is a generation behind current standards
The code is a complete mess with no planning or design behind it
It is written in outdated languages and in an outdated, non object-oriented style
It is difficult to find developers who know the language because it is so old
Unlike some of the other opinions here, I've seen plenty of modern applications that work decently without unit tests. Unit testing still has not caught on with everyone. Perhaps ten years from now the next generation of programmers will look at our current applications and consider them "legacy" for not containing unit tests, just as I consider non object-oriented applications to be legacy.
If few changes need to be made to a legacy codebase, it's better to simply leave it as-is and go with the flow. If the application needs drastic functionality changes, a GUI overhaul, and/or you can't find anyone who knows the programming language, it's time to throw away and start over. A word of warning, however: rewriting from scratch can be very time-consuming, and it's difficult to know if you've replicated all functionality. You'll probably want to have test cases and unit tests written for the legacy application and the new application.
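A minimal sketch of that last suggestion, with made-up stand-ins for the legacy code and the rewrite: feed both the same inputs and flag any divergence.

```python
def legacy_discount(total):
    # Stand-in for the old application's logic.
    return total * 0.9 if total > 100 else total

def new_discount(total):
    # Stand-in for the rewritten logic that should behave identically.
    return total - total * 0.1 if total > 100 else total

for total in [0, 50, 100, 100.01, 250, 999.99]:
    old, new = legacy_discount(total), new_discount(total)
    assert abs(old - new) < 1e-9, f"divergence at {total}: {old} != {new}"
print("rewrite matches legacy behavior on all sampled inputs")
```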
Quite honestly, legacy code is any code, framework, API, or other software construct that's not "cool" anymore. For example, COBOL is unanimously regarded as legacy while APL is not. Now, one can also make the case that COBOL is considered legacy and APL not because it has about 1m times the install base of APL. However, if you say that you need to work on APL code, the reply would not be "oh no, that legacy stuff" but rather "oh my god, guess you won't be doing anything for the next century". See the difference?
This is a general term thrown around quite often (and quite generically) in the software ecosystem.
Well, I like to think of legacy code as inherited code. This is simply code that was written in the past. In most cases, legacy code does not follow new/current practices and is often considered archaic.
Legacy code is anything written more than a month ago :-)
It's often any code that isn't written in the trendy scripting language du jour, and I'm only half joking.
I work at a company where the rule basically is (as I understand it) that you cannot use any code unless (a) you write the code yourself or (b) there is some explicit indemnification clause guarding your use of any other code (like open source code). I am finding this making my coding difficult.
For example, coding samples in books are pretty much "use as-is". Microsoft SDK code samples are "use as-is". Blog posts about coding are "use as-is". There are several sites out there with code samples (including SO) that are use-at-your-own-risk. No warranties implied, no indemnification against intellectual-property lawsuits, blah, blah, etc.
Basically, I'm confined to using Asp.Net and the .Net Framework and nothing else and to bar my eyes from accidentally picking something up that I haven't created (ok...that may be my anal interpretation of the rule ;-).
I find this difficult because a big part of learning to code I think is reading other code. Reading blogs that have code, reading books that have code, looking at coding samples, using code from SDK samples etc. Also, I would think it is safe to use code that people have shown to be a good solution or pattern for something and freely put up for others to use. I'm not about to think that I can code everything myself. I definitely have to stand on the coding shoulders of others to reach certain heights.
It could be that I don't understand licensing very well either. From the company's perspective (I suppose), they don't want to incur any risk of being sued for IP infringement.
My thought is that you have to weigh risks. Taking a coding snippet from a book is low risk. Incorporating code from an open source library could be high-risk. I say make decisions based on how much risk you are willing to take.
Has anybody had experience working in a situation like this or similar to this? Is this a rare thing or is it common in some sectors? Are there others in the same position like me out there?
Any insight or guidance would be appreciated! Thanks!
Edit:
Thanks for the responses! To clear up some things: I'm not advocating stealing code. I'm talking about code that has some kind of public license that allows it to be used in its defined legal way. The key is that there is no indemnification in public licenses for using the code. That means you use it at your own legal risk (and other risk). If someone sues an open-source project that you used code from, you could be roped into the lawsuit as well, because you are using the code even though it had a public license.
In 2005, Microsoft was using indemnification to compete against open-source vendors by promising its partners that Microsoft would protect them against IP lawsuits. http://www.microsoft.com/presspass/press/2005/jun05/06-22PartnerIndemnificationPR.mspx
So, even if the risk of being sued for IP infringement may be extremely low, it is a non-zero probability. Thus, I can't use any of it. Even if it has a public license of some sort. :-(
The "risk of beind sued for IP infringement" isn't really the right way to think about it. This isn't a "risk" thing.
Either
You have a license and can use the source. There's no risk. You have the license. There can't be a lawsuit.
Or
You don't have a license and you're in violation. Effectively, you will be sued. There's no risk here, either. You're in violation of someone's copyrights (or worse).
Companies are averse to Open Source for a variety of strange reasons. Risk of lawsuit is not one of them.
Things I've heard.
What if it has a virus?
What if it doesn't work as advertised?
What if it "crashes" something? Who do we sue?
None of these are "risk" items. They're "due diligence" items. And mostly, they're easy to address: pick products with enough users that someone else vets the code before you; QA open source as if one of your own people typed it in. Except for one.
This leads us to the real reason. [Hint: It's not "risk of lawsuit".]
There's no one to sue when you didn't perform due diligence on open source.
Most shops don't have real solid configuration management or QA policies (the kind that would stand up in court as best practices). Until they have these things in place, they don't dare think about introducing open source for which you really need solid QA and configuration management.
I think what your company is really worried about is you directly copying large segments of code for which there may be licensing issues, presenting a legal problem to the company if they are caught using it. However, you may read blogs or other non-licensed code and discover a solution which works for the particular problem you are working on. In that case, you would be better off rewriting the code (that is, look at the solution and reproduce it) as opposed to just copying the code and making modifications to it. At my company, that is what they generally recommend for using non-proprietary code.
As well, for small amounts of code (e.g. a standard implementation of a cache) where everyone implements this the same way, every time, your company is unlikely to be afraid of using outside code, as long as you are sure to test it carefully.
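To illustrate the "standard implementation" point: something as conventional as a memoizing cache comes out essentially the same no matter who writes it. A sketch in Python:

```python
def memoize(fn):
    # The conventional shape: a dict keyed by arguments, filled on first use.
    cache = {}
    def wrapper(*args):
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]
    return wrapper

@memoize
def slow_square(n):
    return n * n  # stand-in for an expensive computation
```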
By "indemnification", I assume they mean assurance that the code is free of copyright or patent or maybe trade secret encumbrance that they don't know about up front, or that somebody's willing to compensate them if something like that turns up. I've never been in a company that worried about this, nor have I heard of one before.
It's not clear what you actually want here, other than sympathy (and I do have sympathy for people trapped in corporate foolishness). It sounds like the policy is quite rigid, if you're worried about sample code in books. This is a bad policy, and will hinder you, but I don't know what you can do about it. Unlike Joel's blog post on getting things done as a grunt, it sounds like you can't just start doing thing intelligently without being in clear violation of corporate policy.
Not knowing your situation, my suggestion would be to look for another job. This one will definitely stifle your professional growth, and a company with that policy is unlikely to be reasonable about it.
(It would be nice if you could assure them there was no danger, but that's not true. People have lied about copyrights, although open source projects tend not to, and only a fool would claim definitely that a large chunk of code did not infringe on any patents in the US; even if it was written a year before software patents were first awarded, that would be merely good grounds for a court fight, rather than avoiding a court fight. GPLed software is actually better than BSD software, since it requires some patent licensing downstream, but it can't deal with third-party patents. Of course, if they're that worried about being sued, writing in-house software is no solution. That can infringe on patents.)
You could rename the variables, and how would they find out? Do they check every line of code? Universities tell you all the time not to copy code without referencing it. Why don't you try coding something, using parts of code you find on the Internet?
Generally, you will use more from communities like Stack Overflow or blogs than from open-source projects.
Finally, since the code has no warranties, it's at your own risk... well, it's the same case as if you came up with the code by yourself: it's at your own risk.
Hope that helps... and good luck.
It could be that I don't understand licensing very well either. From the company's perspective (I suppose), they don't want to incur any risk of being sued for IP infringement.
My thought is that you have to weigh risks. Taking a coding snippet from a book is low risk. Incorporating code from an open source library could be high-risk. I say make decisions based on how much risk you are willing to take.
I'm not sure if I understood correctly. If you are saying that license infringement is fine when you don't get caught, I will have to disagree with you.
You can learn by reading code without breaking laws or getting fired. Just don't copy the code to your company's code base if the license doesn't allow it.
If you're not aware of the "clean room" concept, then there's always that approach. Have a friend look at some open source code and get them to tell you how they think it works. Diagram it out, and then code it yourself.
It worked for IBM, right?
Keep in mind that not all Open Source is GPL. Your company can copy as much BSD-licensed code as they like. BSD-licensed code has made it into OS X (that's probably my biggest understatement of today) and to a lesser extent Windows NT.
So, I'm looking at using Smalltalk/Squeak for a couple of hobby/academic interest projects, and while trying to read up on the language I came across this nice article. However, this paragraph had me a bit dumbfounded:
"Unfortunately, there is a complete lack of standardization for providing or dealing with modules/packages in Smalltalk. Some dialects provide very strong, comprehensive support for modules/packages (including versioning and distributed access by programming teams,) and other dialects provide little or nothing in this regard. Some dialects provide a robust implementation of multiple, shareable namespaces, others don't. The only commonality is that, when either modules/packages or namespaces are provided, they are implemented as reified objects, in the same way that classes and methods are implemented as reified objects."
So, I have tried googling for it, and this shows up on the Squeak wiki: http://wiki.squeak.org/squeak/734. Does anyone know if this (or something similar) is now part of the standard distribution?
As Mue says, it is not perceived as a big problem in the Squeak community. Prefixing is "good enough". A while back I tried hard to do something better and still maintain the unique feeling of Smalltalk:
http://swiki.krampe.se/gohu/32
...but even though lots of people thought it was nice, it didn't catch on. The code more or less works, though. There are several other approaches too; unfortunately, most of them just copy some stupid approach from a lesser language, thus destroying the feeling of Smalltalk.
Namespaces are not part of Squeak today. But it's a common agreement to prefix all classes of one's own project with two or three letters. That's not as safe as real namespaces, but it's lightweight, simple, and works. +smile+
The Google Summer of Code supported a namespace project called Environments. Chris Cunnington is currently investigating it, and he says it looks promising.
Not necessarily related except by name, Squeak 4.5 has taken another run at the problem, with Colin Putney's Environments package.
Sounds like you should check out Newspeak.
Any useful metrics will be fine
One of the things that I look for in code is unit tests. They give you the freedom to refactor it. So if the code does not have tests, I consider it legacy code.
If the code:
has been replaced by newer code that implements the same functionality or better
is not being used by current systems
is soon to be replaced by something else altogether
has been archived for historic reasons
is no longer supported by its vendor
We use the term "legacy" to refer to any code, still in use, developed using technology we have ceased active development in.
It is code that we would rather rewrite using more recent tools than modify in its current state.
Michael Feathers, author of the excellent "Working Effectively with Legacy Code", defines it as any code that does not have tests.
A better question would probably be what marks a piece of code as non legacy.
To me legacy means unchangeable. So as soon as you're no longer 'able' to change it it's legacy.
Whether that ability is removed by fixed requirements, fear of breakage, knowledge loss, or some other impact is largely irrelevant.
A related note is that I don't think I'd ever use the exact word legacy as it stirs up too many emotions to be useful.
I don't believe there is a definitive answer, but I do believe that the likelihood that code is legacy code increases with the number of people who don't want to touch it and the likelihood that changing it will cause it to break.
the term "legacy code" is subjective and is probably a loaded term. but in general I subscribe to the view that legacy code is one that is not unit-testable and as such is hard to refactor.
When the code is old enough that you never met the developer who originally wrote it.
When 3rd party libraries aren't supported anymore.
In my opinion, all code that is written is legacy code. It might take some time before the original intent and all the decisions made about the code are forgotten, but sooner or later you cannot imagine what they were thinking while writing it. You never write legacy code yourself, right?
Using unit tests or some measure like seconds since the developer has left the building do not really measure whether or not the code is legacy code. Legacy code may have a good set of unit tests and comments and it may have undergone a strict code review and other analysis. This doesn't mean that the code is still relevant for the program at hand. It just suggests that the code might be comparably well written. And if it is no longer relevant, the code will actually make it harder to solve the problem the program is developed for.
Legacy code has been defined in many places as "code without tests". I don't think they are specific in the types of tests, but in general, if you can't make a change to your code without the fear of something unknown happening, well, it quickly devolves.
See "Working Effectively with Legacy Code"
I may be wrong, but I don't think there is an established metric for this.
Usually a piece of code is deemed to be legacy when it has seen at least 5-6 release cycles (maybe more). More often than not, the original implementor is no longer around and the code is maintained by others.
Almost seconds after the devs leave the premises. :)
If...
there's no money in the bank for new features
you can't find anyone that admits working on the project that needs fixing
the source code to the project you own has gone MIA
...then you're working on legacy code.
Usually people refer to something as legacy code when no one is still around that is familiar with or feels comfortable maintaining the code.
Unit tests make it easier for people unfamiliar with code to dig into it, so the theory is it helps prevent code from becoming "legacy".
Often, when code is legacy, it is changed in a different manner. People are afraid to change it, but also changes tend to be quick and dirty because nobody understands the full consequences. Code-duplication issues may arise, because people don't want to take the risk associated with deeper changes.
So, in such circumstances, the situation may get worse, at an increasing rate.
I don't know of any real metrics that can be used to determine if something is "legacy code" or not, but anything older than just written could be considered legacy. Legacy code means different things to different people/organizations, so it really is somewhat subjective.