Passing my own project on someone else - what to do? [closed] - language-agnostic

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
Often there are situations where a project is passed on someone else. And often this process is unpleasant for both sides - the new owner complains about horrible documentation, bugs and bad design. The original owner is then bothered for months with questions about the project, requests to fix old bugs etc.
I might soon be in a situation where one of my projects will be given to someone else so I can focus on my other projects. I wonder what should I do to make this transfer as smooth as possible. What i already have is a decent documentation, the code is quite good commented and i'm still improving it. Its a medium sized project, not very large but still its not something you can code in a week.
I'm looking for a list of things that should be done in order to help the future owner taking over the project and at the same time will spare me all those annoying questions like "and what does this function do, what purpose does this class have...". I know documentation is a must - what else?
Note: although my project is in C++ i believe this is a language-agnostic question. If there are things you think are specific to some language, please mention them too.

Documentation is one thing, getting it into the head of your new project owner another. IMHO this is a typical situation where "less is more" - the less documentation your colleague has to read to understand something, the better. And, of course, learning takes time - for both of you, accept it.
So
instead of writing lots of documentation, make your code self-commentatory
have all documents / source code etc. in a clean and well named folder structure
make sure your build-process is almost completely automatic
don't forget to document your deployment process, if it is not automatic, too
clean-up, clean-up clean-up!

When taking over a project, documentation is of course desirable, but even more so is a good test suite. Trying to modify a program that you have no means of testing for correctness is a nightmare.

Documentation, but on all levels:
API docs
High level architecture: What components are there, what are their relationships and dependencies
For each component, a high level description pointing to important code sections
Tutorials: If you want to do X, here's how
Data: What data does it use and how, database schemas
Idioms: If you've created some idioms within your code, explain them
And, to start, give the guy a personal introduction to all of the above in person, hopefully doing some needed change in a pair programming way

the new owner complains about horrible documentation, bugs and bad design.
I suspect that no matter what you would do, new owner will always complain about something. People are different, so something that looks easy to understand for you, will look horrible and extremely complicated for someone else.
The original owner is then bothered for months with questions about the project, requests to fix old bugs etc.
In this case you should clearly refuse to help. If you won't refuse, you'll probably end up doing someone else's job for free. If maintaining the project is no longer your job, then the new guy should fix his problem without your help. If "the new guy" can't deal with that, he isn't suitable for the job and should quit.
Its a medium sized project,
"Medium sized" compared to what? How many lines or code, how many files, how many megabytes of code?
I wonder what should I do to make this transfer as smooth as possible. What i already have is a decent documentation, the code is quite good commented and i'm still improving it.
I would handle it like this:
First, do a sweep through the entire code and:
1.1 Remove all commented out blocks of code.
1.2 Remove all unused routines and classes (I'm talking about "forgotten" routines, not parts of utility library).
1.3 Make sure all code follow consistent formatting rules. I.e. you shouldn't mix class_a, ClassA and CClassA in same app, you shouldn't use different styles for putting brackets, etc.
1.4 Make sure that all names (class, variable, function) are self-explanatory. Your code should be as self-explaining as possible - this will save you from writing too much documentation.
1.5 In situations when there is a complicated or hard to understand function, write comments. Keep them as short as possible, and post only when they are absolutely necesarry.
1.6 Try to make sure that there are no known bugs left. If there are known bugs, document them and their behavior.
1.7 Remove garbage from project directories (files that are not used in project, etc.)
1.8 If possible, make sure that code still compiles and works as expected.
Generate html documentation with doxygen. Reveiw it few times, modify code comments a bit until you're satisfied. Or until you're somewhat satisfied with the result. Do not skip this step.
If there is a version control repository (say, git repository) with entire development history, hand it over to a new maintainer, or give him(her?) a functional copy of the repository. This will be useful for (git )bisecting and finding source of the bugs.
Once it is done, and code is transferred to a new maintainer, do not offer "free help", unless you're paid for it (or unless you get something else for helping, or unless it is order from your boss which makes helping new maintainer a part of your current task). Maintaining the code is no longer your job, and if new maintainer can't handle it, he isn't qualified for the job.

I think most of the problems can be avoided with just two simple rules.
Keep the code consistent with platform style guide.
Naming, naming and naming.
If the project is huge, then you just need to run some code camps with the new guys. There's no shortcut for this one.
Remember also that complaining happens mostly because new guy is not qualified enough, i.e. doesn't understand something. That's why it is important to keep things simple. And in case he is more qualified, then I guess you deserve it ;)
Some good advice where to start hacking/changing things is always better than documentation. Consider documentation as a backup material after you are familiar with the code, it should never be the starting point (except if you are exceptional technical writer with unlimited resources and time)

If there is good documentation and commented code as you say, then you've done your part. Just make sure that the documentation includes high-level documentation (architecture, data flow, etc.) as well as lower module or procedure-level documentation.
If this is a situation where you can, I would strongly suggest you protect yourself with some type of contract that specifies what future support (if any) you will provide and for how long.

I think for a situation like this the most important thing is a working, complete build that automatically compiles, documents, and tests the project. That way, there is a well defined point at which the new developer has it working. He can then figure stuff out from the tests and documentation, in principal.

Related

When should I release my code? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
I've been holding off on releasing a library I wrote because it is the first library which I'll be releasing publicly. Here are my concerns:
The library isn't complete it is in a very usable state, I'd say it is version 0.3, however it still lacks a number of features which I would like to at some point implement, and control how they're implemented (meaning not merging someones implementation).
I'm fearful of criticism, I know there are a few things which should be reorganized/refactored, but I wrote the initial class quickly to be functional for another project I am working on.
So when is the best time to release? Should I just throw it up on github and work on the issues post-release? Or should I wait until I refactor and feel completely comfortable with what I have written?
Most classes/libraries I see are always very elegantly written, however I have not seen any in very early release stages, are a lot of classes fairly sloppy upon initial release?
Release early, release often.
Criticism is a good thing as long as its constructive. Ignore the haters, pay attention to the folks filing bug reports and commenting.
The internal structure of the code matters, but it matters more if it works for its intended purpose. In general, refactoring will change how code works internally but will not affect how it is used. Same inputs and outputs.
You need to get something half-way
useful first, and then others will say
"hey, that almost works for me", and
they'll get involved in the project.
Linus Torvalds
Linux Times (2004-10-25).
It depends on why you are doing this. If it's to provide something useful and it's useful and has benefits that no other library has, then go for it. Just list the status and what's coming next.
If you are doing this to point to on a resume, get it in good shape (the code, not necessarily feature complete). Imagine a future employer poking around the code to see what it looks like, not downloading and running the code.
Whether you release the code in an incomplete state or not, it's always worthwhile having enough documentation to allow users to understand how to use the library.... even if it's only API docs. Make sure that anything incomplete is tagged as TO DO - it helps to maintain a target list of tasks to complete, and lets users know that the feature/method/whatever hasn't been forgotten.
Providing a set of code style/standard documents (perhaps with architectural notes on class relationships) allows other developers to contribute more readily, and in a manner that enhances the library rather than making it a hotch-potch of spaghetti code. It's never easy releasing a library, then having to refactor, while maintaining backward compatibility for users who have already taken up and are using that library in a production setting.
EDIT
Don't be afraid of criticism... it goes with the territory.
Some critcism can be constructive (take heed of that).
There'll be plenty of other people who criticise your code (for whatever their reason) without being constructive, or who just denegrate your work. The difference is, you've produced the goods, they probably haven't ever contributed to any OS product/library.
Users expect you to fix their problems immediately, or to write their code for them to use your library, or simply say "it doesn't work" without any explanation of what they mean. You have to learn to live with that 24x7x365.
But just once in a while, somebody will thank you for saving them hours of work, or for providing something useful... and suddenly all the stress and hassle feels worthwhile.
I read a document by Joshua Bloch, a pricipal software engineer at Google that talked a lot about the best type of API design. Basically, once you release it, it is more or less set. He says
Public APIs are forever - one chance to get it right
You can check out the slides here. It's definitely worth reading. I have a PDF of it as well; let me know if you need it.

How do you let others trust your code and use it? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
I write hobby code from time to time. The thing is these tools, classes or tiny libraries of code end up in a flash stick with hopeless future! I would love to develop my projects further, and let other programmers trust them. If you were going to use something you found on the Internet, what is the most important thing you look for in that programming tool or small library? e.g. would you consider separate documentation a must?
Thanks for all contributers. I'll try my best to summarized what have been said. Feel free to modify the list. Corrections and additions are more that welcome :)
Start a blog and let others know you
are here.
Choose the most
suitable license. Possibly Open
Source licenses are the best for
hobby projects.
Put your project where people can
reach it. Consider google-code,
github, sourceforge or
other sites.
Use public version-control and
bug-tracker, So others can acquire the
latest source code of your project to
compile and use.
Write a decent documentation, beside
commenting your code clearly of
course. The documentation should
explain the purpose of the library
and provide at least simple examples.
Write tests if you are willing to provide real-world code.
If you are building a library, put a
lot of effort into designing a stable
interface.
Get a blog, release code through it. Explain why you wrote it, what problem it solves. And encourage others to improve upon it, keep the code posted as current as possible. If your tools are useful you will very quickly develop a following that 'trusts' your code.
Separate documentation isn't a must for small tools, but anything creeping into the framework world should probably have ample documentation and examples if you want any serious adoption from the community at large.
The most important thing is that the library is that it be open source, so I can read the code myself. If that is not possible then I insist on documentation.
Also consider using a project-hosting site (like google code or github).
Have a clear license with your code if you don't have one already
(preferably one which encourages modifying / improving / sharing your
code ...)
Have public version control and/or a public bug/issue tracker and/or a mailing list. There are a lot of good sites which offer these services for free.
Seperate documentation is not a deciding factor to me (if the code is well documented and the code quality is high).
Documentation explaining why you wrote it, when you started it, and it's intended function. Understanding where you're coming from will allow me to see future ideas as well as short coming you may not have seen.
Technical documentation explaining the API and some examples on how to implement it. Ideally, keep your documentation in the format that follows the language. For example C# tends to use the XML syntax for defining items. This allows me to feel at home when I'm reading it.
Clean code -- I can't stress this enough because far too many people write exceptionally ugly code. If you're code is ugly and/or unreadable, it may be easier for me to write it from scratch on my own. At the very least, make your code consistent. If I can't understand the code, I won't feel comfortable with it.
Historical records explaining your changes. Seeing how the project has grown allows me to plan better. It also allows people to see how you learn from your mistakes and get a sense of your skill level. Compared to a forum, you can get a feel for how fast things get fixed and then placed in to a new release.
Think long and hard on what kind of license you want there. Public domain? BSD? GPL? More restrictive?
A note on whether or not you mind being contacted and if there are any restrictions in this. For example, would you mind updates? Me explaining security holes? Or perhaps you might use a forum or wiki?
The ability for me to get your latest work and/or nightly builds. SVN or something. This is useful so I know if a bug I found is already fixed.
I think that documentation is a key point for your project.
The document must indicate:
what is the purpose of your library
what are the main features
a really short tutorial, to make it run in 5 minutes.
Many examples
I let people trust my code in a number of projects, but I urge people to make and maintain their own tests, and I make sure that I'm content with the unit tests.
Documentation is always good, but I'm very guilty of finding time to do as much as I would like. But having the author fairly contactable is helpful.
Posting it in an open source repository such as code.google.com or sourceforge.net is probably where to start...
Next to attract attention, document clearly and succintly the purpose of the library / application as outline in one of the answer above.
Finally, blogging and direct mail exchanges happen...
One reason documentation helps people trust your code, is that they know whether a given feature is something which you intended the code to do (and which you will, all else being equal, preserve in future versions of the code), or something that the current code just so happens to do, but which might change at any time as a side-effect of a bugfix or just a refactor.
Some people prefer find out what code really does by looking at it, and that's fine, but documentation tells you (a) what the code is supposed to do, and with any luck (b) what the next version of the code will do. If I want to use your code long-term, and take bugfix updates as you provide them, then I need to know that you've designed an interface that I can rely on and that you're willing to stick to. Documenting it is a strong hint that you're at least trying to do that.

How to tell someone that their mod's to my program are not good?

G'day,
This is related to my question on star developers and to this question regarding telling someone that they're writing bad code but I'm looking at a situation that is more specific.
That is, how do I tell a "star" that their changes to a program that I'd written are poorly made and inconsistently implemented without just sounding like I'm annoyed at someone "playing with my stuff"?
The new functionality added was deliberatly left out of the original version of this shell script to keep the it as simple as possible until we got an idea of the errors we were going to see with the system under load.
Basically, I'd argued that to try and second guess all error situations was impossible and in fact could leave us heading down a completely wrong path after having done a lot of work.
After seeing what needed to be added, someone dived in and made the additions but unfortunately:
the logic is not consistent
the variable names no longer describe the data they contain
there are almost no comments at all
the way in which the variables are used is not easy to follow and massively decreases readability and hence maintainability.
I always try and approach coding from the Damien Conway point of view "Always code as if your system is going to be maintained by a psychopath who knows where you live." That is, I try to make it easy for follow and not as an advertisement for my own brilliance. "What does this piece of code do?" exercises are fun and are best left to obfuscation contests IMHO.
Any suggestions greatly received.
cheers,
I would just be honest about it. You don't necessarily need to point every little detail that's wrong, but it's worth having a couple of examples of any general points you're going to make. You might want to make notes about other examples that you don't call out in the first brief feedback, in case they challenge your reasoning.
Try to make sure that the feedback is entirely about the code rather than the person. For example:
Good: The argument validation in foo() seems inconsistent with that in bar(). In foo(), a NullPointerException is thrown if the caller passes in null, whereas bar() throws IllegalArgumentException.
Bad: Your argument validation is all over the place. You throw NullPointerException in foo() but IllegalArgumentException in bar(). Please try to be consistent.
Even with a "please," the second form is talking about the developer rather than the code.
Of course in many cases you don't need to worry about being so careful, but if you think they're going to be very sensitive about it, it's worth making the effort. (Read what you've written carefully, if it's written feedback: I accidentally included a "you" in the first version to start with :)
I've found that most developers (superstar or not) are pretty reasonable about accepting, "No, I didn't implement that feature because it has problem X." It's possible that I've been lucky though.
Coming from the other perspective, I would encourage you to think about it in their shoes. I will describe a "hypothetical" experience.
Some things to keep in mind:
The guy was trying to do something
good.
Programmers are terrible at
mind reading. They tend to only know
what they read.
He may have not been given complete guidance as what needs to be done(or what doesn't need to be done)
He is likely doing the best he knows how to.
Just keep that in mind and talk to them. Teach them. No need for yelling or pissing contests. Just remember that they are not intentionally trying to make your life difficult.
I see that you've asked a lot of questions about how to deal with certain kinds of developers. It seems to be a common thread for you. You keep asking about how to change people around you. If this is a constant problem for you, then perhaps you are the problem.
Now I know you are asking questions to learn how to deal with people you find difficult, and that's good, however, you keep asking (and getting answers) about how to change people.
It seems to me that you need to change. Work with these people to change the code to what you want it to be. With them. Don't try to get them to do it. Just do it, and tell them what you did and why, and ask suggestions for further improvement, and learn from each other. Play off of each other's experience and strengths. Just my 2 cents.
If you have clearly defined coding standards for the project, point out that the code needs to be changed to meet those standards. The list you have there seems like quite reasonable feedback (though #3 is much argued-over; I would only push to document the really confusing parts as fixing the other three points, hopefully, makes the code less confusing).
If there are any other examples you have in your repository from this developer that are several months old, show one to him and ask him what it is doing. (Show him this one in a few months). When he has to zip around to find out what is actually in his variables, and deconstructing every line of code to figure out what it is doing. Break into a code review / pair programming session right there. Refactor and rename together so that he hopefully begins to see for himself exactly why these things are important.
Frankly, I think this is a political problem, not a coding problem. Specifically...
WHO SAID THIS PERSON WAS A "STAR"? If this is the same person you described in your other question, then you already have your answer there: THIS PERSON IS NO "STAR".
So then you get into the other effects of politics...
Who is claiming this person to be a star? Why can you not just tell the person "this is crap code"? Who is protecting them / defending them were you to do that? Can you do that or would you get blasted / demoted / put on the "to be laid off" pile?
You are asking questions that cannot really be answered in isolation. IF the code is crap, then throw it away and do it correctly yourself. IF there are reasons that you cannot do that, then you need to ask yourself if the benefits of this place outweigh the negatives.
Cheers,
-R
Creating a program and then releasing it to be worked on by other developers is tough. You are throwing your code to the mercy of others' development styles, coding conventions, etc.
Telling those developers that they are doing coding poorly, after the code is in, is one of the hardest things that you can do. It is best to address your concerns before they ever start working with your code. This can be done in two ways: Maintaining a detailed coding standard, requiring that submitted code adheres to this and maintaining a development road map, not to just outline when new features will be in, but to create dependencies to avoid such mishaps.
More to your situation, it is important not to criticize or it could cause hostility and worse code coming in. Maybe you can work with that developer to create standards documentation. You will be able to express your ideas about what the standards should be, and you will get their input, without causing any hard feelings.
Always point out the good things in their code, and be sure when discussing the weaknesses that you frame them pointing out the reasons that it will benefit everyone (the developer included), never criticize.
Good luck.
I would do the following:
Make sure he knows that his hard work is appreciated (preferably, this should be the truth)
Ask him if he would mind making a few changes, making it sound like no big deal and easy to fix
Explain the issues, including why they are issues, and suggest specific changes to set him on the right path.
Hopefully, the exercise will help him integrate into the culture project better.
We try to solve these potential 'issues' proactively:
Every 'bigger' project where people work together gets assigned a project 'codelead' (one of the developers). This rotates every project (based on preferences, experience with the particular task ...) so everyone gets to be in the 'contributing' and 'code-project-lead' roles once in a while.
We explicitely made an agreement that
these project 'leads' can decide
whatever they want to with the code
contributions of the others (sort of
like a temporary dictatorship: change
it, make suggestions, ask people to
redo stuff etc.). The projectcode
'lead' bears the complete
responsibility for the aggregated
code to work.
With these formalised 'leads' (and the changing roles) I think people have less problems with (constructive) criticism of the parts they contribute.
Yes, keep the feedback as appreciative, professional and technical as possible, back up your concerns with possible "worst case" scenarios so that the disadvantages of those features and/or this particular implementation, become blatantly obvious.
Also, if this is about features/code that are very specific and are not of any use to most users, express your concerns about the code/use ratio - indicating concerns about increased code base complexity etc.
Ideally, present your concerns as open-ended questions - in the sense of: "Though, I am wondering if this way of doing it may work in the long term, due to ...". So that you actually encourage an active dialogue between contributors.
Invite your fellow contributors and user to provide their opinions on these concerns, in fact ask other people/contributors what they are thinking about this addition (in terms of pros & cons, requirements, code quality), do make the statement that you are willing to reconsider your current position if other contributors/users can provide corresponding insight.
You are basically encouraging an informal review that way, asking your community to also look into the proposed additions, so that the advantages and disadvantages can be discussed.
So, whatever the decision will be, it will be one that is community-backed, and not just simply made by you.
You being the architect of the original design, are also in an excellent position to provide architectural reasons why something is not (yet) suitable for inclusion/deployment.
If stability, complexity or code quality are a real concern, do illustrate how other contributions also had to go through a certain review process in order to be acceptable.
You can also mention how specific code doesn't really align with your current design, or how it may not scale too well with future extension to your current design, similarly you can highlight why certain stuff was left out explicitly.
If you actually like the features or the core idea, be sure to highlight the excellent addition these features would make if properly implemented and integrated, but do also highlight that the existing implementation isn't really appropriate due to a number of reasons.
If you can, do make specific suggestions for improvements, provide examples of how to do things better, and what to avoid and do express that you hope, this can be reworked to be added with the help of your project's community.
Ideally, present your requirements for actually accepting this contribution and do mention the background for your requirements, you may in fact say that you hate some of these requirements yourself.
Preferably, present and discuss instances where you yourself contributed similar code (or even worse code) and that you ended up facing huge issues due to your own code, so that these policies are now in place to prevent such issues. By actually talking about your own bad code, you can actually be very subjective.
Emphasize that you generally appreciate the effort itself, and that you are willing to provide the necessary help and pointers to bring the code in question into a better shape and form. Also, encourage that similar contributions in the future should be properly coordinated within your community, in order to avoid similar issues.
Always think in terms of features and functionality (and remind your contributor to do the same), not code - imagine it like a thorough code review process, where the final code that ends up being committed/accepted, may have hardly anything in common with the original implementation.
This is again a good possibility, to present examples where you yourself developed code that ended up largely reworked, so that much of it is now replaced by a much better implementation.
Similarly, there's always the issue with code that has no active maintainers, so you can just as easily suggest that you feel concerned about code that may end up being unmaintained, you could even ask if the corresponding developer would be willing to help maintain that code, possibly in a separate branch.
In the same sense, always require new code to be accompanied with proper comments, documentation and other updates. In other words, code that adds new or changes existing functionality, should always be accompanied with updates to all relevant documentation.
Ultimately, if you know right away that you cannot and will not accept any of that code in the near future, you can at least invite the developer to branch or even fork your project, possibly in you repository and with your help and guidance, so that you still express your gratitude for working with your project.

What's the best way to become familiar with a large codebase? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Joining an existing team with a large codebase already in place can be daunting. What's the best approach;
Broad; try to get a general overview of how everything links together, from the code
Narrow; focus on small sections of code at a time, understanding how they work fully
Pick a feature to develop and learn as you go along
Try to gain insight from class diagrams and uml, if available (and up to date)
Something else entirely?
I'm working on what is currently an approx 20k line C++ app & library (Edit: small in the grand scheme of things!). In industry I imagine you'd get an introduction by an experienced programmer. However if this is not the case, what can you do to start adding value as quickly as possible?
--
Summary of answers:
Step through code in debug mode to see how it works
Pair up with someone more familiar with the code base than you, taking turns to be the person coding and the person watching/discussing. Rotate partners amongst team members so knowledge gets spread around.
Write unit tests. Start with an assertion of how you think code will work. If it turns out as you expected, you've probably understood the code. If not, you've got a puzzle to solve and or an enquiry to make. (Thanks Donal, this is a great answer)
Go through existing unit tests for functional code, in a similar fashion to above
Read UML, Doxygen generated class diagrams and other documentation to get a broad feel of the code.
Make small edits or bug fixes, then gradually build up
Keep notes, and don't jump in and start developing; it's more valuable to spend time understanding than to generate messy or inappropriate code.
this post is a partial duplicate of the-best-way-to-familiarize-yourself-with-an-inherited-codebase
Start with some small task if possible, debug the code around your problem.
Stepping through code in debug mode is the easiest way to learn how something works.
Another option is to write tests for the features you're interested in. Setting up the test harness is a good way of establishing what dependencies the system has and where its state resides. Each test starts with an assertion about the way you think the system should work. If it turns out to work that way, you've achieved something and you've got some working sample code to reproduce it. If it doesn't work that way, you've got a puzzle to solve and a line of enquiry to follow.
One thing that I usually suggest to people that has not yet been mentioned is that it is important to become a competent user of the existing code base before you can be a developer. When new developers come into our large software project, I suggest that they spend time becoming expert users before diving in trying to work on the code.
Maybe that's obvious, but I have seen a lot of people try to jump into the code too quickly because they are eager to start making progress.
This is quite dependent on what sort of learner and what sort of programmer you are, but:
Broad first - you need an idea of scope and size. This might include skimming docs/uml if they're good. If it's a long term project and you're going to need a full understanding of everything, I might actually read the docs properly. Again, if they're good.
Narrow - pick something manageable and try to understand it. Get a "taste" for the code.
Pick a feature - possibly a different one to the one you just looked at if you're feeling confident, and start making some small changes.
Iterate - assess how well things have gone and see if you could benefit from repeating an early step in more depth.
Pairing with strict rotation.
If possible, while going through the documentation/codebase, try to employ pairing with strict rotation. Meaning, two of you sit together for a fixed period of time (say, a 2 hour session), then you switch pairs, one person will continue working on that task while the other moves to another task with another partner.
In pairs you'll both pick up a piece of knowledge, which can then be fed to other members of the team when the rotation occurs. What's good about this also, is that when a new pair is brought together, the one who worked on the task (in this case, investigating the code) can then summarise and explain the concepts in a more easily understood way. As time progresses everyone should be at a similar level of understanding, and hopefully avoid the "Oh, only John knows that bit of the code" syndrome.
From what I can tell about your scenario, you have a good number for this (3 pairs), however, if you're distributed, or not working to the same timescale, it's unlikely to be possible.
I would suggest running Doxygen on it to get an up-to-date class diagram, then going broad-in for a while. This gives you a quickie big picture that you can use as you get up close and dirty with the code.
I agree that it depends entirely on what type of learner you are. Having said that, I've been at two companies which had very large code-bases to begin with. Typically, I work like this:
If possible, before looking at any of the functional code, I go through unit tests that are already written. These can generally help out quite a lot. If they aren't available, then I do the following.
First, I largely ignore implementation and look only at header files, or just the class interfaces. I try to get an idea of what the purpose of each class is. Second, I go one level deep into the implementation starting with what seems to be the area of most importance. This is hard to gauge, so occasionally I just start at the top and work my way down in the file list. I call this breadth-first learning. After this initial step, I generally go depth-wise through the rest of the code. The initial breadth-first look helps to solidify/fix any ideas I got from the interface level, and then the depth-wise look shows me the patterns that have been used to implement the system, as well as the different design ideas. By depth-first, I mean you basically step through the program using the debugger, stepping into each function to see how it works, and so on. This obviously isn't possible with really large systems, but 20k LOC is not that many. :)
Work with another programmer who is more familiar with the system to develop a new feature or to fix a bug. This is the method that I've seen work out the best.
I think you need to tie this to a particular task. When you have time on your hands, go for whichever approach you are in the mood for.
When you have something that needs to get done, give yourself a narrow focus and get it done.
Get the team to put you on bug fixing for two weeks (if you have two weeks). They'll be happy to get someone to take responsibility for that, and by the end of the period you will have spent so much time problem-solving with the library that you'll probably know it pretty well.
If it has unit tests (I'm betting it doesn't). Start small and make sure the unit tests don't fail. If you stare at the entire codebase at once your eyes will glaze over and you will feel overwhelmed.
If there are no unit tests, you need to focus on the feature that you want. Run the app and look at the results of things that your feature should affect. Then start looking through the code trying to figure out how the app creates the things you want to change. Finally change it and check that the results come out the way you want.
You mentioned it is an app and a library. First change the app and stick to using the library as a user. Then after you learn the library it will be easier to change.
From a top down approach, the app probably has a main loop or a main gui that controls all the action. It is worth understanding the main control flow of the application. It is worth reading the code to give yourself a broad overview of the main flow of the app. If it is a GUI app, creating a paper that shows which screens there are and how to get from one screen to another. If it is a command line app, how the processing is done.
Even in companies it is not unusual to have this approach. Often no one fully understands how an application works. And people don't have time to show you around. They prefer specific questions about specific things so you have to dig in and experiment on your own. Then once you get your specific question you can try to isolate the source of knowledge for that piece of the application and ask it.
Start by understanding the 'problem domain' (is it a payroll system? inventory? real time control or whatever). If you don't understand the jargon the users use, you'll never understand the code.
Then look at the object model; there might already be a diagram or you might have to reverse engineer one (either manually or using a tool as suggested by Doug). At this stage you could also investigate the database (if any), if should follow the object model but it may not, and that's important to know.
Have a look at the change history or bug database, if there's an area that comes up a lot, look into that bit first. This doesn't mean that it's badly written, but that it's the bit everyone uses.
Lastly, keep some notes (I prefer a wiki).
The existing guys can use it to sanity check your assumptions and help you out.
You will need to refer back to it later.
The next new guy on the team will really thank you.
I had a similar situation. I'd say you go like this:
If its a database driven application, start from the database and try to make sense of each table, its fields and then its relation to the other tables.
Once fine with the underlying store, move up to the ORM layer. Those table must have some kind of representation in code.
Once done with that then move on to how and where from these objects are coming from. Interface? what interface? Any validations? What preprocessing takes place on them before they go to the datastore?
This would familiarize you better with the system. Remember that trying to write or understand unit tests is only possible when you know very well what is being tested and why it needs to be tested in only that way.
And in case of a large application that is not driven towards databases, I'd recommend an other approach:
What the main goal of the system?
What are the major components of the system then to solve this problem?
What interactions each of the component has among them? Make a graph that depicts component dependencies. Ask someone already working on it. These componentns must be exchanging something among each other so try to figure out those as well (like IO might be returning File object back to GUI and like)
Once comfortable to this, dive into component that is least dependent among others. Now study how that component is further divided into classes and how they interact wtih each other. This way you've got a hang of a single component in total
Move to the next least dependent component
To the very end, move to the core component that typically would have dependencies on many of the other components which you've already tackled
While looking at the core component, you might be referring back to the components you examined earlier, so dont worry keep working hard!
For the first strategy:
Take the example of this stackoverflow site for instance. Examine the datastore, what is being stored, how being stored, what representations those items have in the code, how an where those are presented on the UI. Where from do they come and what processing takes place on them once they're going back to the datastore.
For the second one
Take the example of a word processor for example. What components are there? IO, UI, Page and like. How these are interacting with each other? Move along as you learn further.
Be relaxed. Written code is someone's mindset, froze logic and thinking style and it would take time to read that mind.
First, if you have team members available who have experience with the code you should arrange for them to do an overview of the code with you. Each team member should provide you with information on their area of expertise. It is usually valuable to get multiple people explaining things, because some will be better at explaining than others and some will have a better understanding than others.
Then, you need to start reading the code for a while without any pressure (a couple of days or a week if your boss will provide that). It often helps to compile/build the project yourself and be able to run the project in debug mode so you can step through the code. Then, start getting your feet wet, fixing small bugs and making small enhancements. You will hopefully soon be ready for a medium-sized project, and later, a big project. Continue to lean on your team-mates as you go - often you can find one in particular who is willing to mentor you.
Don't be too hard on yourself if you struggle - that's normal. It can take a long time, maybe years, to understand a large code base. Actually, it's often the case that even after years there are still some parts of the code that are still a bit scary and opaque. When you get downtime between projects you can dig in to those areas and you'll often find that after a few tries you can figure even those parts out.
Good luck!
You may want to consider looking at source code reverse engineering tools. There are two tools that I know of:
SWAG Kit (Linux only) link
Bauhaus academic commercial
Both tools offer similar feature sets that include static analysis that produces graphs of the relations between modules in the software.
This mostly consists of call graphs and type/class decencies. Viewing this information should give you a good picture of how the parts of the code relate to one another. Using this information, you can dig into the actual source for the parts that you are most interested in and that you need to understand/modify first.
I find that just jumping in to code can be a a bit overwhelming. Try to read as much documentation on the design as possible. This will hopefully explain the purpose and structure of each component. Its best if an existing developer can take you through it but that isn't always possible.
Once you are comfortable with the high level structure of the code, try to fix a bug or two. this will help you get to grips with the actual code.
I like all the answers that say you should use a tool like Doxygen to get a class diagram, and first try to understand the big picture. I totally agree with this.
That said, this largely depends on how well factored the code is to begin with. If its a gigantic mess, it's going to be hard to learn. If its clean, and organized properly, it shouldn't be that bad.
See this answer on how to use test coverage tools to locate the code for a feature of interest, without knowing anything about where that feature is, or how it is spread across many modules.
(shameless marketing ahead)
You should check out nWire. It is an Eclipse plugin for navigating and visualizing large codebases. Many of our customers use it to break-in new developers by printing out visualizations of the major flows.

The best way to familiarize yourself with an inherited codebase

Stacker Nobody asked about the most shocking thing new programmers find as they enter the field.
Very high on the list, is the impact of inheriting a codebase with which one must rapidly become acquainted. It can be quite a shock to suddenly find yourself charged with maintaining N lines of code that has been clobbered together for who knows how long, and to have a short time in which to start contributing to it.
How do you efficiently absorb all this new data? What eases this transition? Is the only real solution to have already contributed to enough open-source projects that the shock wears off?
This also applies to veteran programmers. What techniques do you use to ease the transition into a new codebase?
I added the Community-Building tag to this because I'd also like to hear some war-stories about these transitions. Feel free to share how you handled a particularly stressful learning curve.
Pencil & Notebook ( don't get distracted trying to create a unrequested solution)
Make notes as you go and take an hour every monday to read thru and arrange the notes from previous weeks
with large codebases first impressions can be deceiving and issues tend to rearrange themselves rapidly while you are familiarizing yourself.
Remember the issues from your last work environment aren't necessarily valid or germane in your new environment. Beware of preconceived notions.
The notes/observations you make will help you learn quickly what questions to ask and of whom.
Hopefully you've been gathering the names of all the official (and unofficial) stakeholders.
One of the best ways to familiarize yourself with inherited code is to get your hands dirty. Start with fixing a few simple bugs and work your way into more complex ones. That will warm you up to the code better than trying to systematically review the code.
If there's a requirements or functional specification document (which is hopefully up-to-date), you must read it.
If there's a high-level or detailed design document (which is hopefully up-to-date), you probably should read it.
Another good way is to arrange a "transfer of information" session with the people who are familiar with the code, where they provide a presentation of the high level design and also do a walk-through of important/tricky parts of the code.
Write unit tests. You'll find the warts quicker, and you'll be more confident when the time comes to change the code.
Try to understand the business logic behind the code. Once you know why the code was written in the first place and what it is supposed to do, you can start reading through it, or as someone said, prolly fixing a few bugs here and there
My steps would be:
1.) Setup a source insight( or any good source code browser you use) workspace/project with all the source, header files, in the code base. Browsly at a higher level from the top most function(main) to lowermost function. During this code browsing, keep making notes on a paper/or a word document tracing the flow of the function calls. Do not get into function implementation nitti-gritties in this step, keep that for a later iterations. In this step keep track of what arguments are passed on to functions, return values, how the arguments that are passed to functions are initialized how the value of those arguments set modified, how the return values are used ?
2.) After one iteration of step 1.) after which you have some level of code and data structures used in the code base, setup a MSVC (or any other relevant compiler project according to the programming language of the code base), compile the code, execute with a valid test case, and single step through the code again from main till the last level of function. In between the function calls keep moting the values of variables passed, returned, various code paths taken, various code paths avoided, etc.
3.) Keep repeating 1.) and 2.) in iteratively till you are comfortable up to a point that you can change some code/add some code/find a bug in exisitng code/fix the bug!
-AD
I don't know about this being "the best way", but something I did at a recent job was to write a code spider/parser (in Ruby) that went through and built a call tree (and a reverse call tree) which I could later query. This was slightly non-trivial because we had PHP which called Perl which called SQL functions/procedures. Any other code-crawling tools would help in a similar fashion (i.e. javadoc, rdoc, perldoc, Doxygen etc.).
Reading any unit tests or specs can be quite enlightening.
Documenting things helps (either for yourself, or for other teammates, current and future). Read any existing documentation.
Of course, don't underestimate the power of simply asking a fellow teammate (or your boss!) questions. Early on, I asked as often as necessary "do we have a function/script/foo that does X?"
Go over the core libraries and read the function declarations. If it's C/C++, this means only the headers. Document whatever you don't understand.
The last time I did this, one of the comments I inserted was "This class is never used".
Do try to understand the code by fixing bugs in it. Do correct or maintain documentation. Don't modify comments in the code itself, that risks introducing new bugs.
In our line of work, generally speaking we do no changes to production code without good reason. This includes cosmetic changes; even these can introduce bugs.
No matter how disgusting a section of code seems, don't be tempted to rewrite it unless you have a bugfix or other change to do. If you spot a bug (or possible bug) when reading the code trying to learn it, record the bug for later triage, but don't attempt to fix it.
Another Procedure...
After reading Andy Hunt's "Pragmatic Thinking and Learning - Refactor Your Wetware" (which doesn't address this directly), I picked up a few tips that may be worth mentioning:
Observe Behavior:
If there's a UI, all the better. Use the app and get a mental map of relationships (e.g. links, modals, etc). Look at HTTP request if it helps, but don't put too much emphasis on it -- you just want a light, friendly acquaintance with app.
Acknowledge the Folder Structure:
Once again, this is light. Just see what belongs where, and hope that the structure is semantic enough -- you can always get some top-level information from here.
Analyze Call-Stacks, Top-Down:
Go through and list on paper or some other medium, but try not to type it -- this gets different parts of your brain engaged (build it out of Legos if you have to) -- function-calls, Objects, and variables that are closest to top-level first. Look at constants and modules, make sure you don't dive into fine-grained features if you can help it.
MindMap It!:
Maybe the most important step. Create a very rough draft mapping of your current understanding of the code. Make sure you run through the mindmap quickly. This allows an even spread of different parts of your brain to (mostly R-Mode) to have a say in the map.
Create clouds, boxes, etc. Wherever you initially think they should go on the paper. Feel free to denote boxes with syntactic symbols (e.g. 'F'-Function, 'f'-closure, 'C'-Constant, 'V'-Global Var, 'v'-low-level var, etc). Use arrows: Incoming array for arguments, Outgoing for returns, or what comes more naturally to you.
Start drawing connections to denote relationships. Its ok if it looks messy - this is a first draft.
Make a quick rough revision. Its its too hard to read, do another quick organization of it, but don't do more than one revision.
Open the Debugger:
Validate or invalidate any notions you had after the mapping. Track variables, arguments, returns, etc.
Track HTTP requests etc to get an idea of where the data is coming from. Look at the headers themselves but don't dive into the details of the request body.
MindMap Again!:
Now you should have a decent idea of most of the top-level functionality.
Create a new MindMap that has anything you missed in the first one. You can take more time with this one and even add some relatively small details -- but don't be afraid of what previous notions they may conflict with.
Compare this map with your last one and eliminate any question you had before, jot down new questions, and jot down conflicting perspectives.
Revise this map if its too hazy. Revise as much as you want, but keep revisions to a minimum.
Pretend Its Not Code:
If you can put it into mechanical terms, do so. The most important part of this is to come up with a metaphor for the app's behavior and/or smaller parts of the code. Think of ridiculous things, seriously. If it was an animal, a monster, a star, a robot. What kind would it be. If it was in Star Trek, what would they use it for. Think of many things to weigh it against.
Synthesis over Analysis:
Now you want to see not 'what' but 'how'. Any low-level parts that through you for a loop could be taken out and put into a sterile environment (you control its inputs). What sort of outputs are you getting. Is the system more complex than you originally thought? Simpler? Does it need improvements?
Contribute Something, Dude!:
Write a test, fix a bug, comment it, abstract it. You should have enough ability to start making minor contributions and FAILING IS OK :)! Note on any changes you made in commits, chat, email. If you did something dastardly, you guys can catch it before it goes to production -- if something is wrong, its a great way to get a teammate to clear things up for you. Usually listening to a teammate talk will clear a lot up that made your MindMaps clash.
In a nutshell, the most important thing to do is use a top-down fashion of getting as many different parts of your brain engaged as possible. It may even help to close your laptop and face your seat out the window if possible. Studies have shown that enforcing a deadline creates a "Pressure Hangover" for ~2.5 days after the deadline, which is why deadlines are often best to have on a Friday. So, BE RELAXED, THERE'S NO TIMECRUNCH, AND NOW PROVIDE YOURSELF WITH AN ENVIRONMENT THAT'S SAFE TO FAIL IN. Most of this can be fairly rushed through until you get down to details. Make sure that you don't bypass understanding of high-level topics.
Hope this helps you as well :)
All really good answers here. Just wanted to add few more things:
One can pair architectural understanding with flash cards and re-visiting those can solidify understanding. I find questions such as "Which part of code does X functionality ?", where X could be a useful functionality in your code base.
I also like to open a buffer in emacs and start re-writing some parts of the code base that I want to familiarize myself with and add my own comments etc.
One thing vi and emacs users can do is use tags. Tags are contained in a file ( usually called TAGS ). You generate one or more tags files by a command ( etags for emacs vtags for vi ). Then we you edit source code and you see a confusing function or variable you load the tags file and it will take you to where the function is declared ( not perfect by good enough ). I've actually written some macros that let you navigate source using Alt-cursor,
sort of like popd and pushd in many flavors of UNIX.
BubbaT
The first thing I do before going down into code is to use the application (as several different users, if necessary) to understand all the functionalities and see how they connect (how information flows inside the application).
After that I examine the framework in which the application was built, so that I can make a direct relationship between all the interfaces I have just seen with some View or UI code.
Then I look at the database and any database commands handling layer (if applicable), to understand how that information (which users manipulate) is stored and how it goes to and comes from the application
Finally, after learning where data comes from and how it is displayed I look at the business logic layer to see how data gets transformed.
I believe every application architecture can de divided like this and knowning the overall function (a who is who in your application) might be beneficial before really debugging it or adding new stuff - that is, if you have enough time to do so.
And yes, it also helps a lot to talk with someone who developed the current version of the software. However, if he/she is going to leave the company soon, keep a note on his/her wish list (what they wanted to do for the project but were unable to because of budget contraints).
create documentation for each thing you figured out from the codebase.
find out how it works by exprimentation - changing a few lines here and there and see what happens.
use geany as it speeds up the searching of commonly used variables and functions in the program and adds it to autocomplete.
find out if you can contact the orignal developers of the code base, through facebook or through googling for them.
find out the original purpose of the code and see if the code still fits that purpose or should be rewritten from scratch, in fulfillment of the intended purpose.
find out what frameworks did the code use, what editors did they use to produce the code.
the easiest way to deduce how a code works is by actually replicating how a certain part would have been done by you and rechecking the code if there is such a part.
it's reverse engineering - figuring out something by just trying to reengineer the solution.
most computer programmers have experience in coding, and there are certain patterns that you could look up if that's present in the code.
there are two types of code, object oriented and structurally oriented.
if you know how to do both, you're good to go, but if you aren't familiar with one or the other, you'd have to relearn how to program in that fashion to understand why it was coded that way.
in objected oriented code, you can easily create diagrams documenting the behaviors and methods of each object class.
if it's structurally oriented, meaning by function, create a functions list documenting what each function does and where it appears in the code..
i haven't done either of the above myself, as i'm a web developer it is relatively easy to figure out starting from index.php to the rest of the other pages how something works.
goodluck.