What's the best way to become familiar with a large codebase? [closed] - legacy

Joining an existing team with a large codebase already in place can be daunting. What's the best approach?
Broad; try to get a general overview of how everything links together, from the code
Narrow; focus on small sections of code at a time, understanding how they work fully
Pick a feature to develop and learn as you go along
Try to gain insight from class diagrams and UML, if available (and up to date)
Something else entirely?
I'm working on what is currently an approx 20k line C++ app & library (Edit: small in the grand scheme of things!). In industry I imagine you'd get an introduction by an experienced programmer. However if this is not the case, what can you do to start adding value as quickly as possible?
--
Summary of answers:
Step through code in debug mode to see how it works
Pair up with someone more familiar with the code base than you, taking turns to be the person coding and the person watching/discussing. Rotate partners amongst team members so knowledge gets spread around.
Write unit tests. Start with an assertion of how you think code will work. If it turns out as you expected, you've probably understood the code. If not, you've got a puzzle to solve and/or an enquiry to make. (Thanks Donal, this is a great answer)
Go through existing unit tests for functional code, in a similar fashion to above
Read UML, Doxygen generated class diagrams and other documentation to get a broad feel of the code.
Make small edits or bug fixes, then gradually build up
Keep notes, and don't jump in and start developing; it's more valuable to spend time understanding than to generate messy or inappropriate code.
This post is a partial duplicate of the-best-way-to-familiarize-yourself-with-an-inherited-codebase

Start with some small task if possible, debug the code around your problem.
Stepping through code in debug mode is the easiest way to learn how something works.

Another option is to write tests for the features you're interested in. Setting up the test harness is a good way of establishing what dependencies the system has and where its state resides. Each test starts with an assertion about the way you think the system should work. If it turns out to work that way, you've achieved something and you've got some working sample code to reproduce it. If it doesn't work that way, you've got a puzzle to solve and a line of enquiry to follow.
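As a rough illustration of that assertion-first approach, here is a minimal sketch in Python (chosen only for brevity; the same idea works with any C++ test framework). The function `parse_version` is a hypothetical stand-in for a piece of existing code you are trying to understand, not something from the codebase in question. Each test encodes one belief about how it behaves.

```python
def parse_version(text):
    """Hypothetical stand-in for existing code you are trying to understand."""
    parts = text.strip().lstrip("v").split(".")
    return tuple(int(p) for p in parts)


def test_plain_version():
    # Belief: dotted strings become tuples of ints.
    assert parse_version("1.2.3") == (1, 2, 3)


def test_leading_v_is_ignored():
    # Belief: a leading "v" is stripped. If this fails, that is the puzzle.
    assert parse_version("v2.0") == (2, 0)


def test_missing_patch_is_not_padded():
    # Guess to verify: does "1.2" come back as (1, 2) or as (1, 2, 0)?
    assert parse_version("1.2") == (1, 2)
```

A test runner such as pytest would pick these functions up by name. A failing test here is not a bug report; it is a sign that your mental model needs updating.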

One thing that I usually suggest to people that has not yet been mentioned is that it is important to become a competent user of the existing code base before you can be a developer. When new developers come into our large software project, I suggest that they spend time becoming expert users before diving in trying to work on the code.
Maybe that's obvious, but I have seen a lot of people try to jump into the code too quickly because they are eager to start making progress.

This is quite dependent on what sort of learner and what sort of programmer you are, but:
Broad first - you need an idea of scope and size. This might include skimming docs/UML if they're good. If it's a long term project and you're going to need a full understanding of everything, I might actually read the docs properly. Again, if they're good.
Narrow - pick something manageable and try to understand it. Get a "taste" for the code.
Pick a feature - possibly a different one to the one you just looked at if you're feeling confident, and start making some small changes.
Iterate - assess how well things have gone and see if you could benefit from repeating an early step in more depth.

Pairing with strict rotation.
If possible, while going through the documentation/codebase, try to employ pairing with strict rotation. That means two of you sit together for a fixed period of time (say, a 2 hour session), then you switch pairs: one person continues working on that task while the other moves to another task with another partner.
In pairs you'll both pick up a piece of knowledge, which can then be fed to other members of the team when the rotation occurs. What's good about this also, is that when a new pair is brought together, the one who worked on the task (in this case, investigating the code) can then summarise and explain the concepts in a more easily understood way. As time progresses everyone should be at a similar level of understanding, and hopefully avoid the "Oh, only John knows that bit of the code" syndrome.
From what I can tell about your scenario, you have a good number for this (3 pairs), however, if you're distributed, or not working to the same timescale, it's unlikely to be possible.

I would suggest running Doxygen on it to get an up-to-date class diagram, then going broad-in for a while. This gives you a quickie big picture that you can use as you get up close and dirty with the code.

I agree that it depends entirely on what type of learner you are. Having said that, I've been at two companies which had very large code-bases to begin with. Typically, I work like this:
If possible, before looking at any of the functional code, I go through unit tests that are already written. These can generally help out quite a lot. If they aren't available, then I do the following.
First, I largely ignore implementation and look only at header files, or just the class interfaces. I try to get an idea of what the purpose of each class is. Second, I go one level deep into the implementation starting with what seems to be the area of most importance. This is hard to gauge, so occasionally I just start at the top and work my way down in the file list. I call this breadth-first learning. After this initial step, I generally go depth-wise through the rest of the code. The initial breadth-first look helps to solidify/fix any ideas I got from the interface level, and then the depth-wise look shows me the patterns that have been used to implement the system, as well as the different design ideas. By depth-first, I mean you basically step through the program using the debugger, stepping into each function to see how it works, and so on. This obviously isn't possible with really large systems, but 20k LOC is not that many. :)

Work with another programmer who is more familiar with the system to develop a new feature or to fix a bug. This is the method that I've seen work out the best.

I think you need to tie this to a particular task. When you have time on your hands, go for whichever approach you are in the mood for.
When you have something that needs to get done, give yourself a narrow focus and get it done.

Get the team to put you on bug fixing for two weeks (if you have two weeks). They'll be happy to get someone to take responsibility for that, and by the end of the period you will have spent so much time problem-solving with the library that you'll probably know it pretty well.

If it has unit tests (I'm betting it doesn't), start small and make sure the unit tests don't fail. If you stare at the entire codebase at once your eyes will glaze over and you will feel overwhelmed.
If there are no unit tests, you need to focus on the feature that you want. Run the app and look at the results of things that your feature should affect. Then start looking through the code trying to figure out how the app creates the things you want to change. Finally change it and check that the results come out the way you want.
You mentioned it is an app and a library. First change the app and stick to using the library as a user. Then after you learn the library it will be easier to change.
From a top down approach, the app probably has a main loop or a main GUI that controls all the action. It is worth reading the code to give yourself a broad overview of the main control flow of the app. If it is a GUI app, sketch on paper which screens there are and how to get from one screen to another. If it is a command line app, work out how the processing is done.
Even in companies it is not unusual to have to take this approach. Often no one fully understands how an application works, and people don't have time to show you around. They prefer specific questions about specific things, so you have to dig in and experiment on your own. Then, once you have a specific question, you can try to find whoever knows that piece of the application and ask them.

Start by understanding the 'problem domain' (is it a payroll system? inventory? real time control or whatever). If you don't understand the jargon the users use, you'll never understand the code.
Then look at the object model; there might already be a diagram or you might have to reverse engineer one (either manually or using a tool as suggested by Doug). At this stage you could also investigate the database (if any); it should follow the object model but it may not, and that's important to know.
Have a look at the change history or bug database, if there's an area that comes up a lot, look into that bit first. This doesn't mean that it's badly written, but that it's the bit everyone uses.
Lastly, keep some notes (I prefer a wiki).
The existing guys can use it to sanity check your assumptions and help you out.
You will need to refer back to it later.
The next new guy on the team will really thank you.
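The change-history tip above can be sketched roughly as follows, assuming the project lives in git. The function simply counts how often each path appears in text shaped like the output of `git log --name-only --pretty=format:`; the file names in the sample are made up for illustration.

```python
from collections import Counter


def change_hotspots(log_text, top=3):
    """Count how often each path appears in `git log --name-only` style text."""
    counts = Counter(
        line.strip() for line in log_text.splitlines() if line.strip()
    )
    return counts.most_common(top)


# Sample text shaped like the output of:
#   git log --name-only --pretty=format:
# One group of paths per commit, separated by blank lines (hypothetical names).
sample_log = """\
src/parser.cpp
src/parser.h

src/parser.cpp
src/gui/main_window.cpp

src/parser.cpp
"""

print(change_hotspots(sample_log, top=1))  # [('src/parser.cpp', 3)]
```

A file that shows up in many commits is not necessarily badly written, as the answer notes; it may just be the part everyone touches, which makes it a sensible place to start reading.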

I had a similar situation. I'd say you go like this:
If it's a database driven application, start from the database and try to make sense of each table, its fields and then its relation to the other tables.
Once you're fine with the underlying store, move up to the ORM layer. Those tables must have some kind of representation in code.
Once done with that, move on to how and where these objects come from. An interface? What interface? Any validations? What preprocessing takes place on them before they go to the datastore?
This would familiarize you better with the system. Remember that trying to write or understand unit tests is only possible when you know very well what is being tested and why it needs to be tested in only that way.
And in case of a large application that is not database driven, I'd recommend another approach:
What is the main goal of the system?
What are the major components of the system that solve this problem?
What interactions do the components have with each other? Make a graph that depicts component dependencies. Ask someone already working on it. These components must be exchanging something with each other, so try to figure that out as well (for example, IO might be returning a File object back to the GUI, and the like).
Once comfortable with this, dive into the component that is least dependent on the others. Now study how that component is further divided into classes and how they interact with each other. This way you get the hang of a single component in total.
Move to the next least dependent component
To the very end, move to the core component that typically would have dependencies on many of the other components which you've already tackled
While looking at the core component, you might be referring back to the components you examined earlier, so don't worry, keep working hard!
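The "least dependent component first, core last" order described above is essentially a topological sort of the dependency graph. A minimal sketch using Python's standard `graphlib` module follows; the component names and their dependencies are hypothetical, loosely borrowed from the word processor example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical component graph: each key depends on the components in its set.
depends_on = {
    "io": set(),
    "page": {"io"},
    "ui": {"io", "page"},
    "core": {"io", "page", "ui"},
}

# static_order() yields a component only after everything it depends on,
# so the least dependent components come first and the core comes last.
study_order = list(TopologicalSorter(depends_on).static_order())
print(study_order)  # ['io', 'page', 'ui', 'core']
```

In a real project the graph would come from a colleague, from include/import statements, or from a tool, and it may contain cycles, in which case `static_order()` raises `graphlib.CycleError` and you have learned something about the design already.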
For the first strategy:
Take this Stack Overflow site as an example. Examine the datastore: what is being stored, how it is stored, what representations those items have in the code, and how and where they are presented in the UI. Where do they come from, and what processing takes place on them on the way back to the datastore?
For the second one:
Take a word processor as an example. What components are there? IO, UI, Page and the like. How do these interact with each other? Move along as you learn further.
Be relaxed. Written code is someone's mindset, frozen logic and thinking style, and it takes time to read that mind.

First, if you have team members available who have experience with the code you should arrange for them to do an overview of the code with you. Each team member should provide you with information on their area of expertise. It is usually valuable to get multiple people explaining things, because some will be better at explaining than others and some will have a better understanding than others.
Then, you need to start reading the code for a while without any pressure (a couple of days or a week if your boss will provide that). It often helps to compile/build the project yourself and be able to run the project in debug mode so you can step through the code. Then, start getting your feet wet, fixing small bugs and making small enhancements. You will hopefully soon be ready for a medium-sized project, and later, a big project. Continue to lean on your team-mates as you go - often you can find one in particular who is willing to mentor you.
Don't be too hard on yourself if you struggle - that's normal. It can take a long time, maybe years, to understand a large code base. Actually, it's often the case that even after years there are still some parts of the code that are still a bit scary and opaque. When you get downtime between projects you can dig in to those areas and you'll often find that after a few tries you can figure even those parts out.
Good luck!

You may want to consider looking at source code reverse engineering tools. There are two tools that I know of:
SWAG Kit (Linux only)
Bauhaus (academic, commercial)
Both tools offer similar feature sets that include static analysis that produces graphs of the relations between modules in the software.
This mostly consists of call graphs and type/class dependencies. Viewing this information should give you a good picture of how the parts of the code relate to one another. Using this information, you can dig into the actual source for the parts that you are most interested in and that you need to understand/modify first.

I find that just jumping in to code can be a bit overwhelming. Try to read as much documentation on the design as possible. This will hopefully explain the purpose and structure of each component. It's best if an existing developer can take you through it, but that isn't always possible.
Once you are comfortable with the high level structure of the code, try to fix a bug or two. This will help you get to grips with the actual code.

I like all the answers that say you should use a tool like Doxygen to get a class diagram, and first try to understand the big picture. I totally agree with this.
That said, this largely depends on how well factored the code is to begin with. If it's a gigantic mess, it's going to be hard to learn. If it's clean and organized properly, it shouldn't be that bad.

See this answer on how to use test coverage tools to locate the code for a feature of interest, without knowing anything about where that feature is, or how it is spread across many modules.
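The linked answer is not reproduced here, so the following is only one plausible reading of the idea: run the program once without using the feature and once while using it, then look at what was executed only in the second run. The sketch assumes a coverage tool has already given you, per file, the set of executed line numbers; the file names and numbers below are made up.

```python
def feature_specific_lines(with_feature, without_feature):
    """Return lines executed only in the run that exercised the feature.

    Both arguments map a file name to the set of line numbers that a
    coverage tool reported as executed during that run.
    """
    result = {}
    for path, lines in with_feature.items():
        only_here = lines - without_feature.get(path, set())
        if only_here:
            result[path] = sorted(only_here)
    return result


# Made-up coverage data for two runs of the same program.
baseline = {"app.cpp": {1, 2, 3, 10}, "export.cpp": {1, 2}}
with_export = {"app.cpp": {1, 2, 3, 10, 42}, "export.cpp": {1, 2, 7, 8, 9}}

print(feature_specific_lines(with_export, baseline))
# {'app.cpp': [42], 'export.cpp': [7, 8, 9]}
```

The lines that remain are candidates for where the feature lives, even if it is spread across several modules.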

(shameless marketing ahead)
You should check out nWire. It is an Eclipse plugin for navigating and visualizing large codebases. Many of our customers use it to break in new developers by printing out visualizations of the major flows.

Related

Passing my own project on to someone else - what to do? [closed]

Often there are situations where a project is passed on to someone else. And often this process is unpleasant for both sides - the new owner complains about horrible documentation, bugs and bad design. The original owner is then bothered for months with questions about the project, requests to fix old bugs etc.
I might soon be in a situation where one of my projects will be given to someone else so I can focus on my other projects. I wonder what I should do to make this transfer as smooth as possible. What I already have is decent documentation, the code is quite well commented and I'm still improving it. It's a medium sized project, not very large, but still not something you can code in a week.
I'm looking for a list of things that should be done in order to help the future owner take over the project and at the same time spare me all those annoying questions like "and what does this function do, what purpose does this class have...". I know documentation is a must - what else?
Note: although my project is in C++ I believe this is a language-agnostic question. If there are things you think are specific to some language, please mention them too.
Documentation is one thing, getting it into the head of your new project owner another. IMHO this is a typical situation where "less is more" - the less documentation your colleague has to read to understand something, the better. And, of course, learning takes time - for both of you, accept it.
So
instead of writing lots of documentation, make your code self-documenting
have all documents / source code etc. in a clean and well named folder structure
make sure your build-process is almost completely automatic
don't forget to document your deployment process, if it is not automatic, too
clean-up, clean-up clean-up!
When taking over a project, documentation is of course desirable, but even more so is a good test suite. Trying to modify a program that you have no means of testing for correctness is a nightmare.
Documentation, but on all levels:
API docs
High level architecture: What components are there, what are their relationships and dependencies
For each component, a high level description pointing to important code sections
Tutorials: If you want to do X, here's how
Data: What data does it use and how, database schemas
Idioms: If you've created some idioms within your code, explain them
And, to start, give the guy a personal introduction to all of the above in person, hopefully doing some needed change in a pair programming way
"the new owner complains about horrible documentation, bugs and bad design."
I suspect that no matter what you would do, new owner will always complain about something. People are different, so something that looks easy to understand for you, will look horrible and extremely complicated for someone else.
"The original owner is then bothered for months with questions about the project, requests to fix old bugs etc."
In this case you should clearly refuse to help. If you won't refuse, you'll probably end up doing someone else's job for free. If maintaining the project is no longer your job, then the new guy should fix his problem without your help. If "the new guy" can't deal with that, he isn't suitable for the job and should quit.
"It's a medium sized project,"
"Medium sized" compared to what? How many lines of code, how many files, how many megabytes of code?
"I wonder what I should do to make this transfer as smooth as possible. What I already have is decent documentation, the code is quite well commented and I'm still improving it."
I would handle it like this:
First, do a sweep through the entire code and:
1.1 Remove all commented out blocks of code.
1.2 Remove all unused routines and classes (I'm talking about "forgotten" routines, not parts of utility library).
1.3 Make sure all code follows consistent formatting rules. I.e. you shouldn't mix class_a, ClassA and CClassA in the same app, you shouldn't use different styles for putting brackets, etc.
1.4 Make sure that all names (class, variable, function) are self-explanatory. Your code should be as self-explaining as possible - this will save you from writing too much documentation.
1.5 In situations when there is a complicated or hard to understand function, write comments. Keep them as short as possible, and add them only when they are absolutely necessary.
1.6 Try to make sure that there are no known bugs left. If there are known bugs, document them and their behavior.
1.7 Remove garbage from project directories (files that are not used in project, etc.)
1.8 If possible, make sure that code still compiles and works as expected.
Generate HTML documentation with Doxygen. Review it a few times and modify code comments a bit until you're satisfied, or at least somewhat satisfied, with the result. Do not skip this step.
If there is a version control repository (say, a git repository) with the entire development history, hand it over to the new maintainer, or give them a functional copy of the repository. This will be useful for bisecting (e.g. with git bisect) and finding the source of bugs.
Once it is done, and code is transferred to a new maintainer, do not offer "free help", unless you're paid for it (or unless you get something else for helping, or unless it is order from your boss which makes helping new maintainer a part of your current task). Maintaining the code is no longer your job, and if new maintainer can't handle it, he isn't qualified for the job.
I think most of the problems can be avoided with just two simple rules.
Keep the code consistent with platform style guide.
Naming, naming and naming.
If the project is huge, then you just need to run some code camps with the new guys. There's no shortcut for this one.
Remember also that complaining happens mostly because the new guy is not qualified enough, i.e. doesn't understand something. That's why it is important to keep things simple. And in case he is more qualified, then I guess you deserve it ;)
Some good advice on where to start hacking/changing things is always better than documentation. Consider documentation as backup material for after you are familiar with the code; it should never be the starting point (except if you are an exceptional technical writer with unlimited resources and time).
If there is good documentation and commented code as you say, then you've done your part. Just make sure that the documentation includes high-level documentation (architecture, data flow, etc.) as well as lower module or procedure-level documentation.
If this is a situation where you can, I would strongly suggest you protect yourself with some type of contract that specifies what future support (if any) you will provide and for how long.
I think for a situation like this the most important thing is a working, complete build that automatically compiles, documents, and tests the project. That way, there is a well defined point at which the new developer has it working. He can then figure stuff out from the tests and documentation, in principle.

How do you refactor a large messy codebase?

I have a big mess of code. Admittedly, I wrote it myself - a year ago. It's not well commented but it's not very complicated either, so I can understand it -- just not well enough to know where to start as far as refactoring it.
I violated every rule that I have read about over the past year. There are classes with multiple responsibilities, there are indirect accesses (I forget the term - something like foo.bar.doSomething()), and like I said it is not well commented. On top of that, it's the beginnings of a game, so the graphics is coupled with the data, or the places where I tried to decouple graphics and data, I made the data public in order for the graphics to be able to access the data it needs...
It's a huge mess! Where do I start? How would you start on something like this?
My current approach is to take variables and switch them to private and then refactor the pieces that break, but that doesn't seem to be enough. Please suggest other strategies for wading through this mess and turning it into something clean so that I can continue where I left off!
Update two days later: I have been drawing out UML-like diagrams of my classes, and catching some of the "Low Hanging Fruit" along the way. I've even found some bits of code that were the beginnings of new features, but as I'm trying to slim everything down, I've been able to delete those bits and make the project feel cleaner. I'm probably going to refactor as much as possible before rigging my test cases (but only the things that are 100% certain not to impact the functionality, of course!), so that I won't have to refactor test cases as I change functionality. (do you think I'm doing it right or would it, in your opinion, be easier for me to suck it up and write the tests first?)
Please vote for the best answer so that I can mark it fairly! Feel free to add your own answer to the bunch as well, there's still room for you! I'll give it another day or so and then probably mark the highest-voted answer as accepted.
Thanks to everyone who has responded so far!
June 25, 2010: I discovered a blog post which directly answers this question from someone who seems to have a pretty good grasp of programming: (or maybe not, if you read his article :) )
To that end, I do four things when I need to refactor code:
Determine what the purpose of the code was
Draw UML and action diagrams of the classes involved
Shop around for the right design patterns
Determine clearer names for the current classes and methods
Pick yourself up a copy of Martin Fowler's Refactoring. It has some good advice on ways to break down your refactoring problem. About 75% of the book is little cookbook-style refactoring steps you can do. It also advocates automated unit tests that you can run after each step to prove your code still works.
As for a place to start, I would sit down and draw out a high-level architecture of your program. You don't have to get fancy with detailed UML models, but some basic UML is not a bad idea. You need a big picture idea of how the major pieces fit together so you can visually see where your decoupling is going to happen. Just a page or two of some basic block diagrams will help with the overwhelming feeling you have right now.
Without some sort of high level spec or design, you just risk getting lost again and ending up with another unmaintainable mess.
If you need to start from scratch, remember that you never truly start from scratch. You have some code and the knowledge you gained from your first time. But sometimes it does help to start with a blank project and pull things in as you go, rather than put out fires in a messy code base. Just remember not to completely throw out the old, use it for its good parts and pull them in as you go.
What was most important for me on different occasions were unit tests: I took a few days to write tests for the old code and then I was free to refactor with confidence. How exactly is a different question, but having the tests made it possible for me to make real, substantial changes to the code.
I'll second everyone's recommendations for Fowler's Refactoring, but in your specific case you may want to look at Michael Feathers' Working Effectively with Legacy Code, which is really perfect for your situation.
Feathers talks about Characterization Tests, which are unit tests not to assert known behaviour of the system but to explore and define the existing (unclear) behaviour -- in the case where you've written your own legacy code, and fixing it yourself, this may not be so important, but if your design is sloppy then it's quite possible there are parts of the code that work by 'magic' and their behaviour isn't clear, even to you -- in that case, characterization tests will help.
One great part of the book is the discussion about finding (or creating) seams in your codebase -- seams are natural 'fault lines', if you like, where you can break into the existing system to start testing it, and pulling it towards a better design. Hard to explain but well worth a read.
There's a brief paper where Feathers fleshes out some of the concepts from the book, but it really is well worth hunting down the whole thing. It's one of my favourites.
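To make characterization tests and seams a little more concrete, here is a small sketch in Python. It is a loose interpretation of the ideas summarised above, not an example taken from the book; `greeting` is a hypothetical legacy function. The optional `now` parameter acts as the seam: production callers keep using the real clock, while a test can substitute a fixed time.

```python
import datetime


def greeting(now=None):
    """Hypothetical legacy function with a seam: a test can supply the clock."""
    # Before the seam was added, this was hard-wired to the real clock.
    current = now if now is not None else datetime.datetime.now()
    return "Good morning" if current.hour < 12 else "Good afternoon"


# Characterization tests: record what the code does today, whatever that is.
def test_greeting_before_noon():
    assert greeting(datetime.datetime(2010, 6, 25, 9, 0)) == "Good morning"


def test_greeting_at_noon():
    # Noon counts as afternoon in the current implementation; pin that down.
    assert greeting(datetime.datetime(2010, 6, 25, 12, 0)) == "Good afternoon"
```

Once behaviour like the noon boundary is pinned down, the function can be restructured with some confidence that nothing changed by accident.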
Just an additional refactoring that is more important than you think: Name things correctly!
This goes for any variable name and method name. If the name does not accurately reflect what the thing is used for, then rename it to something more accurate. This might require several iterations. If you cannot find a short, and entirely accurate name, then that item does too much and you have an excellent candidate for a code snippet that needs to be split. The names also clearly indicate where the cuts are to be made.
Also, document your stuff. Whenever the answer to WHY? is not clearly conveyed by the answer to HOW? (being the code) you will need to add some documentation. Capturing design decisions is probably the most important task as it is very hard to do in code.
You could always start from "scratch". That doesn't mean scrap it and start from nothing, but try to rethink high-level things from the beginning, since you seem to have learned a lot since the last time you worked on it.
Start from a higher level, and as you build the scaffolding of your new and improved structure, take all the code you can reuse, which will probably be more than you think if you're willing to read through it and make some small changes.
When you're making the changes, be sure to be strict with yourself about following all the good practices you now know, because you will really thank yourself later.
It can be surprisingly refreshing to properly re-make a program to do exactly what it did before, only more "cleanly". ;)
As others have mentioned as well, unit-tests are your best friend! They help you ensure that your refactoring works, and if you're starting from "scratch", it's the perfect time to write them.
You're in a much better position than many people facing this problem in that you understand what the code is supposed to do.
Taking variables out of a shared scope, as you're doing, is a great start, in that you're partitioning responsibilities. Ultimately you want each class to express a single responsibility. A few other things you might look at:
Easy targets for refactoring are code that's duplicated in lots of places and long methods.
If you're managing application state through statically initialized singletons or worse, a global state that everything is talking to, consider moving it to a managed initialization system (e.g. a dependency injection framework like Spring or Guice) or at least make sure that the initialization isn't entangled with the rest of the code.
Centralize and standardize how you're accessing outside resources, especially if you've got things like file locations or urls hardcoded.
Buy an IDE that has good refactoring support. I think IntelliJ is the best, but Eclipse has it now, too.
The unit test idea is key as well. You will want to have a suite of large, overall transactions that will give you the overall behavior of the code.
Once you have those, start creating unit tests for classes and smaller packages. Write the tests to demonstrate proper behavior, make your changes, and re-run the tests to demonstrate that you haven't broken everything.
Track code coverage as you go. You'll want to work it up to 70% or better. For the classes you change, you'll want those to be 70% or better before you make your changes.
Build up that safety net over time and you'll be able to refactor with some confidence.
very slowly :D
No seriously... take it one step at a time. For instance, refactor something only if it affects or helps you write the current bug/feature that you are working on right now and do no more than that. And before you refactor make darn sure that you have some kind of automated test in place that gets run on each build that will actually test what you are writing/refactoring. Even if you don't have unit tests, it is never too late to start adding them for all new and modified code that is being written. Over time, your code base will get better in small increments daily or weekly instead of worse - all without you making monumental heaps of changes.
In my personal opinion and experience, it's not worth it to just refactor a (legacy) codebase en masse for the sake of refactoring. In those cases, it's best to just start from scratch and do it right all over again (and very rarely are you afforded the opportunity to do such a thing). Hence, refactoring incrementally is the way to go.
For Java code, my favorite first step is to run FindBugs and then remove all the dead stores, unused fields, unreachable catch blocks, unused private methods and likely bugs.
Next I run CPD to look for evidence of cut-copy-paste code.
It isn't unusual to be able to reduce the code base by 5% by doing this. It also saves you from refactoring code that is never used.
I think you should use Eclipse as an IDE because it has many plugins and is free. You should follow the MVC pattern and write test cases using JUnit. Eclipse also has a JUnit plugin and provides refactoring support, which will reduce some of your work. And always remember that writing code is not what matters; writing clean code is. So add comments everywhere, so that anyone reading the code, not only you, feels like they are reading an essay.
Refactor the low-hanging fruit. Nibble away at the easy bits, and as you do that, the harder bits will begin to be a little easier. When there aren't any bits left to refactor, you're done.
The refactorings you'll probably find most useful are Rename Method (and even more trivial Renamings like Field, Variable, and Parameter), Extract Method, and Extract Class. For each refactoring you perform, write the necessary unit tests to make the refactoring safe, and run the entire suite of unit tests after each refactoring. It's tempting - and, let's be honest, pretty safe - to rely on the automated refactorings of your IDE, without the tests - but it's good practice and will be good to have the tests into the future as you add functionality to your project.
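For anyone who hasn't seen Extract Method written out, here is a small before/after sketch. The function names and the invoice format are hypothetical; the point is only that the extracted helper gets a name and the outer function keeps exactly the same behavior.

```python
def invoice_text_before(customer, amounts):
    """Original function: summing and formatting are tangled together."""
    total = 0
    for amount in amounts:
        total += amount
    return f"Invoice for {customer}: total {total}"


def sum_amounts(amounts):
    """Extracted method: one small job with a name that says what it does."""
    total = 0
    for amount in amounts:
        total += amount
    return total


def invoice_text_after(customer, amounts):
    """Same behavior as before, now delegating to the extracted helper."""
    return f"Invoice for {customer}: total {sum_amounts(amounts)}"
```

A unit test that calls the function with the same inputs before and after the refactoring is what makes the step safe.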
You might want to look at Martin Fowler's book Refactoring. This is the book that popularized the term and technique (my thought when taking his course: "I've been doing a lot of this all along, I didn't know it had a name"). A quote from the link:
Refactoring is a controlled technique for improving the design of an existing code base. Its essence is applying a series of small behavior-preserving transformations, each of which "too small to be worth doing". However the cumulative effect of each of these transformations is quite significant. By doing them in small steps you reduce the risk of introducing errors. You also avoid having the system broken while you are carrying out the restructuring - which allows you to gradually refactor a system over an extended period of time.
As others have pointed out, unit tests will allow you to refactor with confidence. And start by reducing code duplication. The book will give you lots of other insights.
Here is a catalog of refactorings.
The correct definition of messy code is code that is hard to maintain and change.
For a more mathematical definition, you can check your code with code metrics tools.
This way, you keep the code that is already good enough and very quickly find the bad code.
My experience says that this is a very powerful way to improve the quality of your code (if your tool can show you the results on each build or in real time).
Throw it away, build it new.

How do you plan an application's architecture before writing any code? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
One thing I struggle with is planning an application's architecture before writing any code.
I don't mean gathering requirements to narrow in on what the application needs to do, but rather effectively thinking about a good way to lay out the overall class, data and flow structures, and iterating those thoughts so that I have a credible plan of action in mind before even opening the IDE. At the moment it is all too easy to just open the IDE, create a blank project, start writing bits and bobs and let the design 'grow out' from there.
I gather UML is one way to do this but I have no experience with it so it seems kind of nebulous.
How do you plan an application's architecture before writing any code? If UML is the way to go, can you recommend a concise and practical introduction for a developer of smallish applications?
I appreciate your input.
I consider the following:
what the system is supposed to do, that is, what is the problem that the system is trying to solve
who is the customer and what are their wishes
what the system has to integrate with
are there any legacy aspects that need to be considered
what are the user interactions
etc...
Then I start looking at the system as a black box and:
what are the interactions that need to happen with that black box
what are the behaviours that need to happen inside the black box, i.e. what needs to happen to those interactions for the black box to exhibit the desired behaviour at a higher level, e.g. receive and process incoming messages from a reservation system, update a database etc.
Then this will start to give you a view of the system that consists of various internal black boxes, each of which can be broken down further in the same manner.
UML is very good to represent such behaviour. You can describe most systems just using two of the many components of UML, namely:
class diagrams, and
sequence diagrams.
You may need activity diagrams as well if there is any parallelism in the behaviour that needs to be described.
A good resource for learning UML is Martin Fowler's excellent book "UML Distilled" (Amazon link - sanitised for the script kiddie link nazis out there (-: ). This book gives you a quick look at the essential parts of each of the components of UML.
Oh. What I've described is pretty much Ivar Jacobson's approach. Jacobson is one of the Three Amigos of OO. In fact UML was initially developed by the other two of the Three Amigos, Grady Booch and Jim Rumbaugh.
I really find that a first-off of writing on paper or whiteboard is really crucial. Then move to UML if you want, but nothing beats the flexibility of just drawing it by hand at first.
You should definitely take a look at Steve McConnell's Code Complete, and especially at his giveaway chapter on "Design in Construction".
You can download it from his website: http://cc2e.com/File.ashx?cid=336
If you're developing for .NET, Microsoft have just published (as a free e-book!) the Application Architecture Guide 2.0b1. It provides loads of really good information about planning your architecture before writing any code.
If you were desperate I expect you could use large chunks of it for non-.NET-based architectures.
I'll preface this by saying that I do mostly web development where much of the architecture is already decided in advance (WebForms, now MVC) and most of my projects are reasonably small, one-person efforts that take less than a year. I also know going in that I'll have an ORM and DAL to handle my business object and data interaction, respectively. Recently, I've switched to using LINQ for this, so much of the "design" becomes database design and mapping via the DBML designer.
Typically, I work in a TDD (test driven development) manner. I don't spend a lot of time up front working on architectural or design details. I do gather the overall interaction of the user with the application via stories. I use the stories to work out the interaction design and discover the major components of the application. I do a lot of whiteboarding during this process with the customer -- sometimes capturing details with a digital camera if they seem important enough to keep in diagram form. Mainly my stories get captured in story form in a wiki. Eventually, the stories get organized into releases and iterations.
By this time I usually have a pretty good idea of the architecture. If it's complicated or there are unusual bits -- things that differ from my normal practices -- or I'm working with someone else (not typical), I'll diagram things (again on a whiteboard). The same is true of complicated interactions -- I may design the page layout and flow on a whiteboard, keeping it (or capturing via camera) until I'm done with that section. Once I have a general idea of where I'm going and what needs to be done first, I'll start writing tests for the first stories. Usually, this goes like: "Okay, to do that I'll need these classes. I'll start with this one and it needs to do this." Then I start merrily TDDing along and the architecture/design grows from the needs of the application.
Periodically, I'll find myself wanting to write some bits of code over again or think "this really smells" and I'll refactor my design to remove duplication or replace the smelly bits with something more elegant. Mostly, I'm concerned with getting the functionality down while following good design principles. I find that using known patterns and paying attention to good principles as you go along works out pretty well.
http://dn.codegear.com/article/31863
I use UML, and find that guide pretty useful and easy to read. Let me know if you need something different.
UML is a notation. It is a way of recording your design, but not (in my opinion) of doing a design. If you need to write things down, I would recommend UML, though, not because it's the "best" but because it is a standard which others probably already know how to read, and it beats inventing your own "standard".
I think the best introduction to UML is still UML Distilled, by Martin Fowler, because it's concise, gives practical guidance on where to use it, and makes it clear you don't have to buy into the whole UML/RUP story for it to be useful.
Doing design is hard. It can't really be captured in one StackOverflow answer. Unfortunately, my design skills, such as they are, have evolved over the years and so I don't have one source I can refer you to.
However, one model I have found useful is robustness analysis (google for it, but there's an intro here). If you have your use-cases for what the system should do, a domain model of what things are involved, then I've found robustness analysis a useful tool in connecting the two and working out what the key components of the system need to be.
But the best advice is read widely, think hard, and practice. It's not a purely teachable skill, you've got to actually do it.
I'm not smart enough to plan ahead more than a little. When I do plan ahead, my plans always come out wrong, but now I've spent n days on bad plans. My limit seems to be about 15 minutes on the whiteboard.
Basically, I do as little work as I can to find out whether I'm headed in the right direction.
I look at my design for critical questions: when A does B to C, will it be fast enough for D? If not, we need a different design. Each of these questions can be answered with a spike. If the spikes look good, then we have the design and it's time to expand on it.
I code in the direction of getting some real customer value as soon as possible, so a customer can tell me where I should be going.
Because I always get things wrong, I rely on refactoring to help me get them right. Refactoring is risky, so I have to write unit tests as I go. Writing unit tests after the fact is hard because of coupling, so I write my tests first. Staying disciplined about this stuff is hard, and a different brain sees things differently, so I like to have a buddy coding with me. My coding buddy has a nose, so I shower regularly.
Let's call it "Extreme Programming".
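A spike for a "will it be fast enough?" question can be as small as the sketch below. Everything here is an assumption made for illustration: `candidate_lookup` stands in for whatever design is in doubt, the data is synthetic, and the 0.5 second budget is an invented requirement.

```python
import time


def candidate_lookup(items, wanted):
    """Hypothetical design under question: linear scans over a list."""
    return [w for w in wanted if w in items]


def spike_seconds(n=2_000):
    """Time one run of the candidate on synthetic data of size n."""
    items = list(range(n))
    wanted = list(range(0, n, 2))
    start = time.perf_counter()
    candidate_lookup(items, wanted)
    return time.perf_counter() - start


if __name__ == "__main__":
    budget = 0.5  # assumed requirement, in seconds
    elapsed = spike_seconds()
    print(f"{elapsed:.4f}s, within budget: {elapsed <= budget}")
```

The code is throwaway on purpose: it exists to answer one question about the design and is then discarded.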
"White boards, sketches and Post-it notes are excellent design
tools. Complicated modeling tools have a tendency to be more
distracting than illuminating." From Practices of an Agile Developer
by Venkat Subramaniam and Andy Hunt.
I'm not convinced anything can be planned in advance before implementation. I've got 10 years' experience, but that's only been at 4 companies (including 2 sites at one company, that were almost polar opposites), and almost all of my experience has been in terms of watching colossal cluster********s occur. I'm starting to think that stuff like refactoring is really the best way to do things, but at the same time I realize that my experience is limited, and I might just be reacting to what I've seen. What I'd really like to know is how to gain the best experience so I'm able to arrive at proper conclusions, but it seems like there's no shortcut and it just involves a lot of time seeing people doing things wrong :(. I'd really like to give a go at working at a company where people do things right (as evidenced by successful product deployments), to know whether I'm just a contrarian bastard, or if I'm really as smart as I think I am.
I beg to differ: UML can be used for application architecture, but is more often used for technical architecture (frameworks, class or sequence diagrams, ...), because this is where those diagrams can most easily be kept in sync with the development.
Application Architecture occurs when you take some functional specifications (which describe the nature and flows of operations without making any assumptions about a future implementation), and you transform them into technical specifications.
Those specifications represent the applications you need for implementing some business and functional needs.
So if you need to process several large financial portfolios (functional specification), you may determine that you need to divide that large specification into:
a dispatcher to assign those heavy calculations to different servers
a launcher to make sure all calculation servers are up and running before starting to process those portfolios.
a GUI to be able to show what is going on.
a "common" component to develop the specific portfolio algorithms, independently of the rest of the application architecture, in order to facilitate unit testing, but also some functional and regression testing.
So basically, to think about application architecture is to decide what "group of files" you need to develop in a coherent way (you can not develop in the same group of files a launcher, a GUI, a dispatcher, ...: they would not be able to evolve at the same pace)
When an application architecture is well defined, each of its components is usually a good candidate for a configuration component, that is, a group of files which can be versioned as a whole in a VCS (Version Control System), meaning all its files will be labeled together every time you need to record a snapshot of that application (again, it would be hard to label your whole system at once, since its applications cannot all be in a stable state at the same time).
I have been doing architecture for a while. I use BPML to first refine the business process and then use UML to capture various details! Third step generally is ERD! By the time you are done with BPML and UML your ERD will be fairly stable! No plan is perfect and no abstraction is going to be 100%. Plan on refactoring, goal is to minimize refactoring as much as possible!
I try to break my thinking down into two areas: a representation of the things I'm trying to manipulate, and what I intend to do with them.
When I'm trying to model the stuff I'm trying to manipulate, I come up with a series of discrete item definitions- an ecommerce site will have a SKU, a product, a customer, and so forth. I'll also have some non-material things that I'm working with- an order, or a category. Once I have all of the "nouns" in the system, I'll make a domain model that shows how these objects are related to each other- an order has a customer and multiple SKUs, many SKUs are grouped into a product, and so on.
These domain models can be represented as UML domain models, class diagrams, and SQL ERDs.
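The same "nouns" can also be sketched directly as code. This is only an illustration of the relationships described above; the field names (`code`, `price_cents`, and so on) and the `total_cents` helper are assumptions, not part of any real schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Sku:
    code: str
    price_cents: int


@dataclass
class Product:
    name: str
    # Many SKUs are grouped into a product.
    skus: List[Sku] = field(default_factory=list)


@dataclass(frozen=True)
class Customer:
    name: str


@dataclass
class Order:
    # An order has a customer and multiple SKUs.
    customer: Customer
    skus: List[Sku] = field(default_factory=list)

    def total_cents(self) -> int:
        """Sum of the prices of every SKU on the order."""
        return sum(sku.price_cents for sku in self.skus)
```

Writing the nouns down like this tends to surface questions early, such as whether an order should reference SKUs or products.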
Once I have the nouns of the system figured out, I move on to the verbs- for instance, the operations that each of these items go through to commit an order. These usually map pretty well to use cases from my functional requirements- the easiest way to express these that I've found is UML sequence, activity, or collaboration diagrams or swimlane diagrams.
It's important to think of this as an iterative process; I'll do a little corner of the domain, and then work on the actions, and then go back. Ideally I'll have time to write code to try stuff out as I'm going along- you never want the design to get too far ahead of the application. This process is usually terrible if you think that you are building the complete and final architecture for everything; really, all you're trying to do is establish the basic foundations that the team will be sharing in common as they move through development. You're mostly creating a shared vocabulary for team members to use as they describe the system, not laying down the law for how it's gotta be done.
I find myself having trouble fully thinking a system out before coding it. It's just too easy to only bring a cursory glance to some components which you only later realize are much more complicated than you thought they were.
One solution is to just try really hard. Write UML everywhere. Go through every class. Think how it will interact with your other classes. This is difficult to do.
What I like doing is to make a general overview at first. I don't like UML, but I do like drawing diagrams which get the point across. Then I begin to implement it. Even while I'm just writing out the class structure with empty methods, I often see things that I missed earlier, so then I update my design. As I'm coding, I'll realize I need to do something differently, so I'll update my design. It's an iterative process. The concept of "design everything first, and then implement it all" is known as the waterfall model, and I think others have shown it's a bad way of doing software.
Try Archimate.

What's your Modus Operandi for solving a (programming) problem? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
While solving any programming problem, what is your modus operandi? How do you fix a problem?
Do you write everything you can about the observable behaviors of the bug or problem?
Take me through the mental checklist of actions you take.
(As they say - First, solve the problem. Then, write the code)
Step away from the computer and grab some paper and a pen or pencil if you prefer.
If I'm around the computer then I try to program a solution right then and there and it normally doesn't work right or it's just crap. Pen and paper force me to think a little more.
First, I go to one bicycle shop; or another.
Once I figure nobody invented that particular bicycle,
Figure out appropriate data structures for the domain and the problem, and then map needed algorithms for dealing with those data structures in ways you need.
Divide and conquer. Solve subsets of the problem
This algorithm has never failed me:
Take Action. Often just sitting there and being terrified or miffed by the problem will not help solve it. Also, often, no amount of thinking will solve the problem. So you have to get your hands dirty and grapple with the problem head on.
Test. Under exactly what conditions, input values or states, does the problem occur? Make a mental model of why these particular conditions might cause the problem. Check similar conditions that don't cause the problem. Test enough so that you have a clear understanding of the problem.
Visualise. Put debug code in, dump variable contents, single step code whatever. Do anything that clarifies exactly what is going on where - within the problem conditions.
Simplify. Remove or comment code, poke values into variables, run particular functions with certain values. Try your hardest to get to the nub of the problem by cutting away the chaff or stuff that doesn't have a relevance to the problem at hand. Copy code into a separate project and run it, if you have to, to remove dependencies.
Accept. A great man said: "whatever remains, however improbable, must be the truth". In other words, after simplifying as much as you can, whatever is left must be the problem, no matter how bizarre it may seem at first.
Logic. Double, triple check the logic of the problem. Does it make sense? What would have to be true for it to make sense? Is there something you're missing? Is your understanding of the algorithm wrong? If all else fails, re-engineer the problem away.
As an adjunct to step 3, as a last resort, I often employ the binary search method of finding wayward code. Simply comment out half the code and see if the problem disappears. If it does then it must be in that half (and vice versa). Halve the remaining code and continue.
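The binary search idea can be sketched as a small function. It assumes the suspect code can be treated as an ordered list of steps and that you have some check, here a hypothetical `is_broken` callback, that says whether the problem shows up when only the first few steps are enabled.

```python
def first_bad_index(steps, is_broken):
    """Return the index of the first step whose inclusion triggers the problem.

    Assumes the empty prefix is fine, the full list is broken, and once a
    prefix is broken every longer prefix is broken too.
    """
    lo, hi = 0, len(steps)  # prefix of length lo is good, length hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_broken(steps[:mid]):
            hi = mid  # the culprit is within the first mid steps
        else:
            lo = mid  # the first mid steps are clean
    return hi - 1
```

Each round halves the range still under suspicion, so even a few hundred candidate lines need fewer than ten checks.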
Google is great for searching for error messages and common problems. Somewhere, someone has usually encountered your problem before and found a solution.
Pencil and paper. Pseudo code and workflow diagrams.
Talk to other developers about it. It really helps when you have to force yourself to simplify the problem for someone else to understand. They may also have another angle. Sometimes it's hard to see the forest for the trees.
Go for a walk. Take your head out of the problem. Take a step back and try to see the bigger picture of what you want to achieve. Make sure the problem you are 'trying' to solve is the one you 'need' to solve.
A big whiteboard is great to work on. Use it to write out workflows and relationships. Talk through what is happening with another team member
Move on. Do something else. Let your subconscious work on the problem. Allow the solution to come to you.
write down the problem
think very hard
write down the answer
I can't believe no one posted this already:
Write up your problem on StackOverflow, and let a bunch of other people solve it for you.
My method, something analytic-synthetic:
Calm down. Take a deep breath. Focus your attention on what you're going to solve. This may include going for a walk, cleaning the whiteboard, getting scratch paper and pencils ordered, some snacks, etc. Avoid stress.
High level understanding of the problem. In case it is a bug, when does it happen? In what circumstances? If it is a new task, try to diverge on what results are needed. Collect data and evidence, get acceptance descriptions, maybe documentation or a talk with someone who knows about the issue.
Setup the test playground. Try to feel happy with the tools needed. Use the data collected in the previous step to automate something, hopefully the bug if that's the case, some failing tests otherwise.
Start synthesizing, summarizing what you know, reflecting that in code. Execute it again and again. If you are not happy with the results, return to step two with renewed ideas, diverge more: maybe apply tools (in order of cost) that helped before, e.g. divide and conquer, debug, multithread, disassemble, profile, static analysis tools, metrics, etc. Stay in this loop until you can isolate the problem and pass the over-the-phone test.
Now it's time to fix it but you have all the tools set up. It won't be so much trouble. Start writing code, apply refactoring, enjoy describing your solution in the docs.
Get someone to try your solution. She can eventually get you to step 2 but that's ok. Refine your solution and redeploy.
I'm interpreting this as fixing a bug, not a design problem.
Isolate the problem. Does it always occur? Does it occur only the first time run on a set of new data? Does it occur with specific values, but not with others?
Is the system generating any error messages that appear related to the problem? Verify that the error messages are not generated when the problem does not occur.
Has anything been changed recently? Those are likely places to start looking.
Identify the gap between what I know is working (e.g. I can start up the app and attempt to do a query) and what I know is not working (e.g. it gives me an error instead of the expected results). Find an intermediate point in the code where it seems possible to look for a problem (does this contain valid data at this point?). This allows me to isolate the problem on one side or the other of the point I looked.
Read the stack traces. If you have a stack trace, find the first line that mentions in-house code. The problem is not in your libraries. Maybe it will turn out to be, but forget about that possibility at first. The error is in your code. It's not a bug in Java, it's not a bug in Apache Commons HTTP client, it's in code written in your organization.
Think. Come up with something the system could be doing that can cause the symptoms you see. Find a way to validate whether that is what the system is doing.
No possibility the bug is in your code? Google for anything you can think of related. Maybe it is a bug in the library, or poor documentation leading you to use it wrong.
Logic.
Break the problem down, use your own brain and knowledge of each component of the system to determine exactly what is happening and why; then on the basis of this you will discover where the problem isn't, and hence determine where it must be.
I stop working on it until tomorrow. I usually solve my problem in the shower the next day. I find stepping away from the issue, and allowing my brain to clear, allows a fresh perspective on the issue.
Answer these three questions in this order:
Q1: What is the desired output?
I don't care if this is a napkin with scribble on it. I want something tangible that shows me what the end result is supposed to look like. If I don't get at least this far then I stop.
Q2: What is the input?
I find out what data I have coming in. Where this data is coming from. What formulas I may need. What dependencies there might be on A happening before B. What permissions if any are necessary to get this data. I then ask Question 3.
Q3: Is there enough input to create the output?
If the answer is No then I go back to Q2 and get more input from whoever can give it to me.
For very large problems I break them down in Phases and apply Q1 Q2 and Q3 to each phase.
To paraphrase Douglas Adams, programming is easy. You only need to stare at a blank screen until your forehead bleeds. For people who are squeamish about their foreheads, my ideal architect-and-build for the bigger problems would go something like this. (For smaller problems, like George Jempty I can only recommend Feynman's Algorithm.)
What I write is couched in an on-site business setting but there are analogues in open-source or distributed teams. And I can't pretend that every, or even most, projects pan out this way. This is just the series of events that I dream about, and occasionally come to pass.
Get advance, concise warning of what the problem is likely to look like. This is not the full, final meeting, but an informal discussion. Uncertainty in certain specification details is fine, as long as the client (or manager) is honest. Then take a piece of paper or text editor, and try to condense what you've learned down to five essential points, and then try to condense those to a single sentence. Be happy you can picture the core problem(s) to be solved without referencing any of your documentation.
Think about it for maybe a couple of hours, maybe playing with code and prototyping, but not with a view to the full architecture: you should even do other stuff, if you've time, or go for a walk. It's great if you can learn about a job an hour before home time in order to deliver a decision around midday the next day, so you get to sleep on it. Spend your time looking at potential libraries, frameworks, data standards. Try to tie together at least two languages or resources (say, Javascript on PHP-generated HTML; or get a Python stub talking RPC to a web service). Flesh out the core problems; zoom in on the details; zoom out to make sure the whole shape is still distinct and makes sense.
Send any questions to the client or manager well in advance of a meeting to discuss both the problem and your proposed solution. Invite as many stakeholders and your programming peers along as is convenient (and as your manager is happy with.) Explain the problem back to them, as you see it, then propose your solution. Explain as much as you can; pitch the technical details at your audience, but also let your explanations fill in more details in your own mental model.
Iterate on 2 and 3 until everyone is happy. Happiness is domain-specific. Your industry might require UML diagrams and line-item quotations, or it might be happy with something jotted on a whiteboard with an almost invisible drywipe marker. Make sure everyone has the same expectations of what you're about to build.
When your client or manager is happy for you to start, clear everything. Close Twitter, instant messenger, IRC and email for an hour or two. Start with the overall structure as you see it. Drop some of your prototype code in and see if it feels right. If it doesn't, change the structure as early as possible. But most of all make sure your colleagues give you a couple of hours of space. Try not to fight fires in this time. Begin with a good heart and cheer, and interest in the project. When you're bogged down later on you'll be glad of the clarity that came out of those first few hours.
How your programming proceeds from there depends on what it actually is, and what tasks the finished code needs to perform. And how you ultimately architect your code, and what external resources you use, will always be dictated by your experience, preference and domain knowledge. But give your project and its stakeholder team the most hopeful, most exciting and most engaged start you can.
Pencil, paper and a whiteboard. If you need more organization, use a tool like MindManager.
Andy Hunt's Pragmatic Thinking and Learning has a lot to say on this question.
Question: How do you eat an elephant?
Answer: One bite at a time.
One technique I like using for really big projects is to get into a room with a whiteboard and a pile of square Post-it Notes.
Write your tasks on the Post-it Notes then start sticking them on the whiteboard.
As you go, you can replace tasks that are too big with multiple notes.
You can shift notes around to change the order that the tasks happen in.
Use different colours to indicate different information; I sometimes use a different colour to indicate stuff that we need to do more research on.
This is a great technique for working with a team. Everybody can see the big picture and can contribute in a highly interactive way.
I think about it. I take anywhere from a couple minutes to a few weeks to mull over the problem and develop a general plan of attack.
Hammer out an initial solution. This solution is probably half-baked and one or more aspects may not work.
Refine that solution. Keep working on the problem till I have something that solves the problem.
(and this may be done at any step in the process) Ask questions on Stack Overflow to clear up any difficulties I'm having at the moment.
One of my ex-colleagues had a unique Modus Operandi. Whenever faced with a hard programming problem (e.g. Knapsack problem or some kind of non-standard optimization problem) he would get stoned on weed, claiming his ability to visualize complex state (such as that of recursive function doing operations on set passed on the stack) was greatly improved. The only difficulty, the next day he could not understand his own code. So eventually I showed him TDD and he has quit smoking...
I write it on a piece of paper and start with my horrible class diagram or flowchart. Then I write it on sticky notes to break it down to "TO DO's".
1 sticky note = 1 task. 1 dumped sticky note = 1 finished task. This works really well for me so far.
Add the problem to StackOverflow, wait about 5-10 minutes and you usually have a brilliant solution! :)
The following applies to a bug rather than building a project from scratch (but even then it could do both if reworded a bit):
Context: What is the problem at hand? What is it preventing, doing wrong, or not doing?
Control: What variables (in the wide sense of the word) are involved? Can the problem be reproduced?
Hypothesise: With enough data on what is occurring or required, it is possible to hypothesise, that is, to draw a mental image of the problem in question.
Evaluate: How much effort, cost, etc, will the correction require? Determine if it's a show stopper or a minor irritant. At this point, it may be too early to tell, but even that is a form of evaluation. This will allow prioritisation.
Plan: How will the problem be approached? Does it require specifications? If so, do them first.
Execute: A.K.A. The fun part.
Test: A.K.A. The not-so-fun-part.
Repeat to satisfaction. Finally:
Feedback: how did it come to be this way? What led us here? Could this have been prevented, and if so, how?
EDIT:
Really summarised, stop, analyse, act.
Probably a gross oversimplification:
But really, this holds 100% true.
CONCEIVE
What are you without an idea? You may have a problem, but first you must define it more explicitly. You have a frozen pizza that you want to eat. You need to cook that pizza! In programming, this is usually your brainstorming session for coming up with a solution from the hip. Here you decide what your approach is.
PLAN
Well, of course you need to cook that pizza! But HOW! Will you use the oven? No. Too easy. You want to build a solar cooker, so you can eat that frozen pizza anywhere that the sun grants you power to do so. This is your design phase. This is your pencil and paper phase. This is where you start to form a cohesive, step-by-step method to implementation.
EXECUTE
Well, you are going to build a solar oven to cook your frozen pizza; you've decided. NOW DO IT. Write code. Test. Commit. Refactor. Commit.
Related question that may be useful:
Helpful points of view, concepts or ways to think about problems every newbie should know
Every problem I've ever had to solve on a computer has had something to do with solving a task in the real world. Therefore, I've learned to look at how I would accomplish something in the real world and map that to the computer problem.
Example:
I need to keep track of my students' grades and come up with a final grade that is an average of all the grades throughout the year.
Well, I'd save the grades in a log (database) and I'd have a page for every student (Field StudentID) and so on...
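To make the mapping in this example concrete, here is a minimal sketch. It assumes an in-memory dictionary stands in for the "log" (database), and the names `grade_log`, `record_grade` and `final_grade` are made up for illustration; they are not from the original answer.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the "log" (database):
# one "page" per student, keyed by StudentID.
grade_log = defaultdict(list)


def record_grade(student_id, grade):
    """Append one grade to the student's page."""
    grade_log[student_id].append(grade)


def final_grade(student_id):
    """Final grade = plain average of every grade recorded for the student."""
    grades = grade_log.get(student_id, [])
    if not grades:
        raise ValueError(f"no grades recorded for {student_id}")
    return sum(grades) / len(grades)


record_grade("s001", 80)
record_grade("s001", 90)
record_grade("s002", 70)
print(final_grade("s001"))  # 85.0
```

A real system would swap the dictionary for a database table with a StudentID column, but the real-world picture (a log with one page per student) stays the same.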
I always take a problem to a blog first. Stackoverflow would be a good place to start. Why waste your time re-inventing the wheel when someone else may have already solved a similar problem in the past? If anything you will get some good ideas to solve it yourself.
I use the scientific method:
1. Based on the available information about the programming problem, I come up with a hypothesis about what the cause could be.
2. Then I design an experiment that will reject or confirm the hypothesis. This could be observing something in a debugger or in screen/file output, or changing the program slightly.
3. If the hypothesis is rejected, repeat 1. The information gathered in 2 may help in coming up with a new hypothesis.
4. If the hypothesis is confirmed, it may be refined to become more specific (repeat 1), or it may already be clear what the problem is.
This directed way of finding the problem is much more effective than changing things at random, observing what happens and trying to (inappropriately) use statistics.
No one has mentioned truth tables! But that's probably because they're usually only mildly helpful ;) (although your mileage may vary) I used one for the first time yesterday in my 8 years of programming.
Diagramming on whiteboards or paper has always been very helpful for me.
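For the truth-table suggestion above, a small script can enumerate every input combination so none is missed. This is only a sketch: `truth_table` is a helper written for this illustration, and `can_edit` is a hypothetical condition, not something from the original answer.

```python
from itertools import product


def truth_table(names, predicate):
    """Return one row per input combination: (inputs..., result)."""
    rows = []
    for values in product([False, True], repeat=len(names)):
        rows.append((*values, predicate(*values)))
    return rows


# Hypothetical condition under review: may a user edit a record?
def can_edit(is_owner, is_admin, is_locked):
    return (is_owner or is_admin) and not is_locked


names = ["is_owner", "is_admin", "is_locked"]
print(" | ".join(names + ["can_edit"]))
for row in truth_table(names, can_edit):
    print(" | ".join(str(v) for v in row))
```

Reading the eight rows side by side makes it easy to spot a combination the condition handles differently from what you expected.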
When faced with very weird bugs. Like this: JPA stops working after redeploy in glassfish
I start from scratch. Make a new project. Does it work? Yes. Start to recreate the components of my app one piece at a time. DB. Check. Deploy. Check. Continue until it breaks. If it never breaks, OK: you just recreated your entire app. Discard the old one. When it breaks, you have pinpointed the exact problem.
I think - what am I looking for?
What method best solves this problem?
Implement it with solid logic - no code
Pseudo code
Code a rough cut
Execute
These are my prioritized methods
Analyse
a. Try to find the source of your problem
b. Define desired outcome
c. Brainstorm about solutions
Trial and error (if I don't want to analyse)
Google a bit around
a. Of course, look around on stackoverflow
When you get mad, walk away from the PC for a cup of coffee
When you are still mad after 10 cups of coffee, go and sleep on the problem for a night
GOLDEN TIP
Never give up. Persistence will always win

The best way to familiarize yourself with an inherited codebase

Stacker Nobody asked about the most shocking thing new programmers find as they enter the field.
Very high on the list is the impact of inheriting a codebase with which one must rapidly become acquainted. It can be quite a shock to suddenly find yourself charged with maintaining N lines of code that have been cobbled together for who knows how long, and to have a short time in which to start contributing to it.
How do you efficiently absorb all this new data? What eases this transition? Is the only real solution to have already contributed to enough open-source projects that the shock wears off?
This also applies to veteran programmers. What techniques do you use to ease the transition into a new codebase?
I added the Community-Building tag to this because I'd also like to hear some war-stories about these transitions. Feel free to share how you handled a particularly stressful learning curve.
Pencil & Notebook (don't get distracted trying to create an unrequested solution)
Make notes as you go and take an hour every Monday to read through and arrange the notes from previous weeks
With large codebases, first impressions can be deceiving and issues tend to rearrange themselves rapidly while you are familiarizing yourself.
Remember the issues from your last work environment aren't necessarily valid or germane in your new environment. Beware of preconceived notions.
The notes/observations you make will help you learn quickly what questions to ask and of whom.
Hopefully you've been gathering the names of all the official (and unofficial) stakeholders.
One of the best ways to familiarize yourself with inherited code is to get your hands dirty. Start with fixing a few simple bugs and work your way into more complex ones. That will warm you up to the code better than trying to systematically review the code.
If there's a requirements or functional specification document (which is hopefully up-to-date), you must read it.
If there's a high-level or detailed design document (which is hopefully up-to-date), you probably should read it.
Another good way is to arrange a "transfer of information" session with the people who are familiar with the code, where they provide a presentation of the high level design and also do a walk-through of important/tricky parts of the code.
Write unit tests. You'll find the warts quicker, and you'll be more confident when the time comes to change the code.
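As a rough sketch of this approach: pin down what the inherited code currently does by asserting your guess and letting the test tell you whether you understood it. `legacy_shipping_cost` below is a hypothetical stand-in for a function you inherited; it and the test names are made up for this illustration.

```python
import unittest


# Hypothetical stand-in for an inherited function you did not write.
def legacy_shipping_cost(weight_kg, express=False):
    cost = 5 + 2 * weight_kg
    if express:
        cost *= 1.5
    if weight_kg > 20:
        cost += 10
    return cost


class LegacyShippingCostTest(unittest.TestCase):
    """Each test records one belief about how the code behaves today."""

    def test_standard_parcel(self):
        self.assertEqual(legacy_shipping_cost(1), 7)

    def test_express_multiplies_whole_cost(self):
        self.assertEqual(legacy_shipping_cost(2, express=True), 13.5)

    def test_heavy_surcharge_is_added_after_express_multiplier(self):
        # (5 + 2 * 25) * 1.5 + 10
        self.assertEqual(legacy_shipping_cost(25, express=True), 92.5)


# Run the tests explicitly so the snippet also works when pasted into a REPL.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(LegacyShippingCostTest)
result = unittest.TextTestRunner().run(suite)
```

A failing assertion here is not a bug report yet; it is a sign that your mental model and the code disagree, which is exactly the kind of wart worth finding before you change anything.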
Try to understand the business logic behind the code. Once you know why the code was written in the first place and what it is supposed to do, you can start reading through it or, as someone said, probably fixing a few bugs here and there.
My steps would be:
1.) Set up a Source Insight (or any good source code browser you use) workspace/project with all the source and header files in the code base. Browse at a high level from the topmost function (main) to the lowermost function. During this code browsing, keep making notes on paper or in a Word document, tracing the flow of the function calls. Do not get into the nitty-gritty of function implementations in this step; keep that for later iterations. In this step, keep track of what arguments are passed to functions, what the return values are, how the arguments are initialized and modified, and how the return values are used.
2.) After one iteration of step 1.), by which point you have some understanding of the code and the data structures used in the code base, set up an MSVC project (or a project for whatever compiler suits the programming language of the code base), compile the code, execute it with a valid test case, and single-step through the code again from main to the last level of function. Between the function calls, keep noting the values of variables passed and returned, the code paths taken, the code paths avoided, etc.
3.) Keep repeating 1.) and 2.) iteratively until you are comfortable enough to change some code, add some code, find a bug in the existing code, or fix the bug!
-AD
I don't know about this being "the best way", but something I did at a recent job was to write a code spider/parser (in Ruby) that went through and built a call tree (and a reverse call tree) which I could later query. This was slightly non-trivial because we had PHP which called Perl which called SQL functions/procedures. Any other code-crawling tools would help in a similar fashion (i.e. javadoc, rdoc, perldoc, Doxygen etc.).
Reading any unit tests or specs can be quite enlightening.
Documenting things helps (either for yourself, or for other teammates, current and future). Read any existing documentation.
Of course, don't underestimate the power of simply asking a fellow teammate (or your boss!) questions. Early on, I asked as often as necessary "do we have a function/script/foo that does X?"
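The call tree and reverse call tree mentioned in this answer can be sketched in a few lines for a single-language codebase. The original spider was written in Ruby and crossed PHP, Perl and SQL; the version below is a much simpler illustration that assumes Python source only and uses the standard `ast` module. `build_call_trees` and the `SAMPLE` source are made up for this example.

```python
import ast
from collections import defaultdict


def build_call_trees(source):
    """Map each top-level function to the names it calls, and the reverse."""
    calls = defaultdict(set)      # caller -> callees
    called_by = defaultdict(set)  # callee -> callers
    tree = ast.parse(source)
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            # Only plain `name(...)` calls; method calls are ignored here.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                calls[func.name].add(node.func.id)
                called_by[node.func.id].add(func.name)
    return calls, called_by


# Hypothetical source to analyse.
SAMPLE = '''
def load(path):
    return parse(read(path))

def read(path):
    return open(path).read()

def parse(text):
    return text.split()

def main():
    print(load("data.txt"))
'''

calls, called_by = build_call_trees(SAMPLE)
print(sorted(calls["main"]))       # ['load', 'print']
print(sorted(called_by["parse"]))  # ['load']
```

Querying `called_by` answers "who uses this function?", which is usually the first question when you are about to change unfamiliar code.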
Go over the core libraries and read the function declarations. If it's C/C++, this means only the headers. Document whatever you don't understand.
The last time I did this, one of the comments I inserted was "This class is never used".
Do try to understand the code by fixing bugs in it. Do correct or maintain documentation. Don't modify comments in the code itself, that risks introducing new bugs.
In our line of work, generally speaking, we make no changes to production code without good reason. This includes cosmetic changes; even these can introduce bugs.
No matter how disgusting a section of code seems, don't be tempted to rewrite it unless you have a bugfix or other change to do. If you spot a bug (or possible bug) when reading the code trying to learn it, record the bug for later triage, but don't attempt to fix it.
Another Procedure...
After reading Andy Hunt's "Pragmatic Thinking and Learning - Refactor Your Wetware" (which doesn't address this directly), I picked up a few tips that may be worth mentioning:
Observe Behavior:
If there's a UI, all the better. Use the app and get a mental map of relationships (e.g. links, modals, etc). Look at HTTP requests if it helps, but don't put too much emphasis on them -- you just want a light, friendly acquaintance with the app.
Acknowledge the Folder Structure:
Once again, this is light. Just see what belongs where, and hope that the structure is semantic enough -- you can always get some top-level information from here.
Analyze Call-Stacks, Top-Down:
Go through and list on paper or some other medium, but try not to type it -- this gets different parts of your brain engaged (build it out of Legos if you have to) -- function-calls, Objects, and variables that are closest to top-level first. Look at constants and modules, make sure you don't dive into fine-grained features if you can help it.
MindMap It!:
Maybe the most important step. Create a very rough draft mapping of your current understanding of the code. Make sure you run through the mindmap quickly. This allows an even spread of different parts of your brain (mostly R-Mode) to have a say in the map.
Create clouds, boxes, etc. wherever you initially think they should go on the paper. Feel free to denote boxes with syntactic symbols (e.g. 'F'-Function, 'f'-closure, 'C'-Constant, 'V'-Global Var, 'v'-low-level var, etc). Use arrows: incoming arrows for arguments, outgoing for returns, or whatever comes more naturally to you.
Start drawing connections to denote relationships. It's OK if it looks messy - this is a first draft.
Make a quick rough revision. If it's too hard to read, do another quick organization of it, but don't do more than one revision.
Open the Debugger:
Validate or invalidate any notions you had after the mapping. Track variables, arguments, returns, etc.
Track HTTP requests etc to get an idea of where the data is coming from. Look at the headers themselves but don't dive into the details of the request body.
MindMap Again!:
Now you should have a decent idea of most of the top-level functionality.
Create a new MindMap that has anything you missed in the first one. You can take more time with this one and even add some relatively small details -- but don't be afraid of what previous notions they may conflict with.
Compare this map with your last one and eliminate any question you had before, jot down new questions, and jot down conflicting perspectives.
Revise this map if it's too hazy. Revise as much as you want, but keep revisions to a minimum.
Pretend It's Not Code:
If you can put it into mechanical terms, do so. The most important part of this is to come up with a metaphor for the app's behavior and/or smaller parts of the code. Think of ridiculous things, seriously. If it was an animal, a monster, a star, a robot, what kind would it be? If it was in Star Trek, what would they use it for? Think of many things to weigh it against.
Synthesis over Analysis:
Now you want to see not 'what' but 'how'. Any low-level parts that threw you for a loop could be taken out and put into a sterile environment (you control its inputs). What sort of outputs are you getting? Is the system more complex than you originally thought? Simpler? Does it need improvements?
Contribute Something, Dude!:
Write a test, fix a bug, comment it, abstract it. You should have enough ability to start making minor contributions and FAILING IS OK :)! Note any changes you made in commits, chat, email. If you did something dastardly, you guys can catch it before it goes to production -- if something is wrong, it's a great way to get a teammate to clear things up for you. Usually listening to a teammate talk will clear up a lot of what made your MindMaps clash.
In a nutshell, the most important thing to do is use a top-down fashion of getting as many different parts of your brain engaged as possible. It may even help to close your laptop and face your seat out the window if possible. Studies have shown that enforcing a deadline creates a "Pressure Hangover" for ~2.5 days after the deadline, which is why deadlines are often best to have on a Friday. So, BE RELAXED, THERE'S NO TIMECRUNCH, AND NOW PROVIDE YOURSELF WITH AN ENVIRONMENT THAT'S SAFE TO FAIL IN. Most of this can be fairly rushed through until you get down to details. Make sure that you don't bypass understanding of high-level topics.
Hope this helps you as well :)
All really good answers here. Just wanted to add few more things:
One can pair architectural understanding with flash cards; revisiting those can solidify understanding. I find questions such as "Which part of the code provides X functionality?" useful, where X is some useful functionality in your code base.
I also like to open a buffer in emacs and start re-writing some parts of the code base that I want to familiarize myself with and add my own comments etc.
One thing vi and emacs users can do is use tags. Tags are contained in a file (usually called TAGS). You generate one or more tags files with a command (etags for emacs, ctags for vi). Then when you edit source code and see a confusing function or variable, you load the tags file and it will take you to where the function is declared (not perfect, but good enough). I've actually written some macros that let you navigate source using Alt-cursor, sort of like popd and pushd in many flavors of UNIX.
BubbaT
The first thing I do before going down into code is to use the application (as several different users, if necessary) to understand all the functionalities and see how they connect (how information flows inside the application).
After that I examine the framework in which the application was built, so that I can make a direct relationship between all the interfaces I have just seen with some View or UI code.
Then I look at the database and any database commands handling layer (if applicable), to understand how that information (which users manipulate) is stored and how it goes to and comes from the application
Finally, after learning where data comes from and how it is displayed I look at the business logic layer to see how data gets transformed.
I believe every application architecture can be divided like this, and knowing the overall function (a who is who in your application) might be beneficial before really debugging it or adding new stuff - that is, if you have enough time to do so.
And yes, it also helps a lot to talk with someone who developed the current version of the software. However, if he/she is going to leave the company soon, keep a note of his/her wish list (what they wanted to do for the project but were unable to because of budget constraints).
Create documentation for each thing you figure out from the codebase.
Find out how it works by experimentation - changing a few lines here and there and seeing what happens.
Use Geany, as it speeds up searching for commonly used variables and functions in the program and adds them to autocomplete.
Find out if you can contact the original developers of the code base, through Facebook or by googling for them.
Find out the original purpose of the code and see if the code still fits that purpose or should be rewritten from scratch to fulfil the intended purpose.
Find out which frameworks the code uses and which editors were used to produce it.
The easiest way to deduce how a piece of code works is to work out how you would have written a certain part yourself, then check whether the code contains such a part.
It's reverse engineering - figuring out something by trying to re-engineer the solution.
Most computer programmers have experience in coding, and there are certain patterns that you can look for in the code.
There are two types of code, object oriented and structurally oriented.
If you know how to do both, you're good to go, but if you aren't familiar with one or the other, you'd have to relearn how to program in that fashion to understand why it was coded that way.
In object oriented code, you can easily create diagrams documenting the behaviors and methods of each object class.
If it's structurally oriented, meaning organised by function, create a function list documenting what each function does and where it appears in the code.
I haven't done either of the above myself; as I'm a web developer, it is relatively easy to figure out how something works by starting from index.php and following it to the rest of the pages.
Good luck.