I'd like to start developing a "simple" game with HTML5 and I'm quite confused by the many resources I found online. I have a solid background in development, but in completely different environments (ironically, I started programming because I wanted to become a game developer, and it's the only thing I've never done in 13 years...).
The confusion derives from the fact that, although I know JavaScript very well and I have some knowledge of HTML5, I can't figure out how to mix what I know with all this new stuff. For example, here's what I was thinking of:
The game would be an implementation of chess. I have some simple "ready made" AI algorithm that I can reuse for single player; the purpose here is to learn HTML5 game development, so this part is not very important at the moment.
I'd like to build a website around the game. For this I'd use a "regular" CMS, as I already know many of them and it would be faster to put up.
Then I'd have the game itself which, in its "offline" version, has nothing to do with the website: as far as I understand, it would live in a page by itself. This is the first question: how do I make the game aware of the user's session? The login would be handled by the CMS (it should be much easier this way, as user management is already implemented).
As a further step, I'd like to move the AI to the server. This is the second question: how do I make the game send player's actions to the Server, and how do I get the answer back?
Later on, I'd like to bring a PvP element to the game, i.e. one-on-one multiplayer (like good old chess). This is the third question: how do I send information from one client to another and keep the conversation going? For this, people have recommended I have a look at Node.js, but it's one more element that I can't figure out how to "glue" to the rest.
Here's an example of a single action in a PVP session, which already gives me a headache: Player 1 sends his move to the Server (how does the game talk to Node.js?). I'd need to identify the Game Id (where and how should I store it?), and make sure the player hasn't manually modified it, so it won't interfere with someone else's game (how?).
I'm aware that the whole thing, as I wrote it, is very messy, but that's precisely how I feel at the moment. I can't figure out where to start, therefore any suggestion is extremely welcome.
Too many things and probably in the wrong order.
A lot of the issues don't seem to me to be particularly related to HTML5 in the first instance.
Start with the obvious thing: you want a single page (basically a JavaScript application) that plays chess, so build that. If you can't build that then the rest is substantially irrelevant; if you can build it (and I don't doubt that you can) then the rest is about building on that capability.
So we get to your first question: at the point at which you load the page you will have the session. It's a web page, like any other web page, so that's how you get the session. If you're offline then you've persisted that from when you were online by whatever means, presumably local storage.
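For example, here's a minimal sketch of that persistence (the storage key and the fields are placeholders for whatever your CMS actually exposes):

// While the page is online it already knows the logged-in user,
// so stash whatever the game needs for later (placeholder fields).
const sessionInfo = { userId: 42, displayName: "player1" };
localStorage.setItem("chessSession", JSON.stringify(sessionInfo));

// Later, possibly offline, the game page reads it back.
const saved = JSON.parse(localStorage.getItem("chessSession") || "null");
if (saved) {
    console.log("Resuming as", saved.displayName);
}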
You want to move the AI to the server? OK, so make sure that the front-end user interaction talks to an "interface" to record the player moves and retrieve the AI moves. Given this separation you can replace the AI on the client with an ajax call (although I'd expect the x to be JSON!) to the server with the same parameters.
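As a rough sketch (the /api/ai-move endpoint and the payload shape are invented for illustration), that client-side call could look something like this:

// Hypothetical endpoint and payload; the point is that the UI only
// talks to this function, so the AI behind it can live on the client
// or on the server without the rest of the game caring.
async function getAiMove(gameId, playerMove) {
    const response = await fetch("/api/ai-move", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ gameId, move: playerMove })
    });
    if (!response.ok) {
        throw new Error("AI move request failed: " + response.status);
    }
    return response.json(); // e.g. { move: "e7e5" }
}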
This gets better: if you want to do player-to-player you're just talking about routing through the server from one user/player to another user/player. The front-end code doesn't have to change, just what the server does at the far end of the ajax call.
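On the server side that can start out as small as this Node.js/Express sketch (the routes and the in-memory store are made up; a real version would also check the session and validate the move):

const express = require("express");
const app = express();
app.use(express.json());

// Made-up in-memory store: gameId -> moves waiting for the opponent.
const pendingMoves = {};

// A player posts a move; it is queued for the other player.
app.post("/api/move", (req, res) => {
    const { gameId, player, move } = req.body;
    (pendingMoves[gameId] = pendingMoves[gameId] || []).push({ player, move });
    res.json({ ok: true });
});

// The opponent polls for new moves (swap this for a push mechanism later).
app.get("/api/move/:gameId", (req, res) => {
    res.json({ moves: pendingMoves[req.params.gameId] || [] });
});

app.listen(3000);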
But for all this, take a step back and solve the problems one at a time. If you do that you should arrive where you want to go without driving yourself nuts worrying about a bucket full of problems that seem scary but that you can probably solve one at a time, and I'd start by getting your game to run, all on its own, in the browser.
About question one: you could give the user a signed cookie, e.g. a cookie that contains his user ID and the SHA-2 hash of that ID plus a long, secret salt (say, 32 bytes or so).
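A minimal Node.js sketch of that idea (the secret and the cookie format are placeholders; an HMAC would be the more standard way to express the same thing):

const crypto = require("crypto");

// Placeholder secret; in practice load it from configuration, never hard-code it.
const SECRET = "replace-with-at-least-32-random-bytes";

// The cookie value is "<userId>.<signature>".
function signUserId(userId) {
    const sig = crypto.createHash("sha256").update(userId + SECRET).digest("hex");
    return userId + "." + sig;
}

// Returns the user ID if the signature checks out, otherwise null.
function verifyCookie(cookieValue) {
    const userId = cookieValue.split(".")[0];
    return signUserId(userId) === cookieValue ? userId : null;
}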
About question two: For exchanging stuff and calling remote functions, I'd use the RPC library dnode.
About question three: Use the same thing for calling methods between clients.
Client code (just an example):
DNode.connect(function (remote) {
    // The server calls newPeer when another player joins, handing us a
    // `peer` object whose methods call back into the other player's browser.
    this.newPeer = function (peer) {
        peer.sendChatMessage("Hello!");
    };
});
You don't have to use game IDs if you use dnode - just hand functions to the browser that are bound to the game. If you need IDs for some reason, use a UUID module to create long, random ones - they're unguessable.
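If you do end up needing explicit game IDs, something along these lines is enough (crypto.randomUUID exists in current Node.js and browsers; older environments would need a UUID library):

// An unguessable game identifier; other players can't enumerate it.
const gameId = crypto.randomUUID();
// e.g. "3b241101-e2bb-4255-8caf-4136c566a962"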
I am building a Flash AS3 application that allows users to modify images (drag and drop, select, scale, alter saturation, etc.) and then submit and save them to a server.
The user will then have the ability to log in and access these saved images via a separate admin tool in a thumbnail gallery. They can either delete an image or click a thumbnail to view it at original size.
I am architecting and building the front end only and will have design ready assets supplied.
Since I have been burned working to a fixed quote before, I would appreciate ANY feedback or advice on quoting this project!
Thanks in advance!
I've done a lot of estimating, and I've found that the only way I can get a reliable estimate is to break down all of the tasks and subtasks to as granular a level as I can, estimate all of those elements, and then add it up. This usually takes me several passes, and a couple of times waking up in the middle of the night.
It's time-intensive, but works out really well in at least three ways.
Obviously, the first way is that you end up with a pretty reliable estimate.
You also think of all kinds of things that you wouldn't have thought of if you hadn't sat down and written everything out (which is a big part of why estimates turn out to be wrong in the first place). You also give yourself the chance to really think through your overall approach, and you end up making better decisions on things like which framework to use.
Writing everything out to the detail level helps a lot in sequencing the work you're doing with the work of other teammates. Makes it easy to see that at a given point you'll be roadblocked if you don't have an API from the server team, etc. Also helps you realize how you will potentially roadblock your teammates, and gives you the ability to deal with that.
Hope that's helpful. Making myself work hard at the estimation end of a project has really helped me be successful in the actual development aspect.
My coworkers and I were having a discussion about this yesterday. It seems that no matter how well we prepare and no matter how much we test and no matter what the client says immediately before the site becomes public, initial site launches almost always seem to be somewhat rocky. Some clients are better than others, but often things that were just fine during testing suddenly go horribly wrong when the site becomes public.
Is this a common experience? I'm not just talking about functionality breaking down (although that's often a problem as well). I'm also talking about sites that work exactly the way we wanted them to, but suddenly are not satisfactory to the client when it's time to make the site public. And I'm talking about clients that have been familiar with the site during most of the development process. Meaning, the public launch is definitely not the first time they've seen the site.
If you've dealt with this problem before, have you found a way to improve the situation? Or is this just something that will always be somewhat of a problem?
Don't worry. This is completely and entirely normal and happens with every piece of software. Everything that can go wrong will go wrong, and the most volatile entity in the development process, the client, will be the cause of these things.
You could do all the Requirements Gathering in the world, write a 100 page Proposal, provide screenshots and updates to the project hourly and the client will still not approve. On a personal note, I feel that the Internet is one of the worst mediums for this, as designs are a lot more free-flowing nowadays and the client will always have a certain picture in his/her mind; one that won't look like the finished product.
I find that a bulletproof contract with defined stages and sign-off sheets is the best way to handle such a situation. Assuming that your work is contracted, you should ensure that at each stage the client is shown the work and is forced to approve each and every change made. At least that way, if the client wants something changed you can tell them that they've already signed off that section and the additional work will cost them extra (also defined within the contract).
Not only did this approach work for me, it made the client stop and think about what he/she REALLY wanted. Luckily for me many of my clients are already tech-oriented, so they understand that these things can take time, but those that haven't a clue about Web Development expect things to be perfect within a couple of days. As long as you make sure that everything is covered in the contract the client will think about what they want and won't pester you with issues after.
Of course, anything you can do in regards to Quality Control would be fantastic and help the project move along nicely. Also ensure that some form of methodology is planned out before the project and that this methodology is known by the client(s). Often changes in fundamental areas can be costly and many clients do not seem to realise that a small change can require many things to be changed.
Yes, saw this several times on our projects (human beings are fickle).
Something that helps us in these situations is a good PM/Account Manager that can handle the customer, which makes things a little bit bearable on the technical level.
Web site launches are usually fairly smooth for us. Of course, we do extensive validation including code inspections, deployments to proto-servers (identical to our production servers), and mountains of documentation.
After every launch, we have a meeting to discuss what went well and what didn't so that we can make adjustments to our overall process and best-known-methods documents.
As for clients that change their minds at the last minute... sigh... we minimize that by having them sign off on the beta version. That way, there is no disagreement when the project is launched. If there is a disagreement, there is always a next release.
For what it's worth, the last site launch I did went off without a hitch. Now, it wasn't a high-traffic site, and there were some bugs that I did eventually fix, but there wasn't anything troubling on the day of the actual launch.
This was an ASP.NET/C# site. It wasn't terribly large or complicated, but it wasn't trivial either. Probably the most notable thing is that it was 100% designed, implemented, and tested by myself, from the database schema all the way up to the CSS. It was also my first time using ASP.NET. There were plenty of bumps in development but by the time I launched it I was pretty familiar with them and so knew what to expect.
I think the lesson to be learned from this is to have a good design up-front, solid implementation skills, and good testing, and a new site doesn't have to be a nightmare. There's at least a possibility of a trouble-free launch.
I wouldn't limit your statement to just web sites. I have worked on a lot of projects over the years and there are always details that get "discovered" when going live. No amount of testing removes all the fun things that can happen.
One thing I will say is that what you learn in the first couple of hours of a new system going "on-line" is way more valuable than all the stuff learned during development. It's show time when the really cool problems and scenarios appear. Learn to love them and use these times as a learning point for the next time. Then each time it will be just as fun!
We used to have this problem a lot, but much less recently.
Partly for us it is about firmer project management and documenting the specification (as suggested in other answers here) but I believe more of the difference came from:
Expectation management - getting the client to accept that iterative changes are still to be expected after launch, that this is normal and not to worry about it
Increasing authority - we are now a well established (13 years) web developer and we can speak with a lot of expertise
Simply being more experienced - we can now predict in advance most of the queries that are likely to come up, and either resolve them, mitigate them or bring them to the client's attention so they don't sting us on the day
Plus, we rarely do big fanfare launches - a soft launch makes things much less stressful.
My experience is that web site launches are almost always rocky. I've only had two exceptions to this common truth. The first was a site developed for a small business run by one person. This went smoothly because, well, there was only one person to please, so it was fairly easy to track what they wanted. The other was a multi-million dollar website launched by a Fortune 500 company. This happened to go smoothly because there were 2 PMs and a small army of consultants there to manage the needs of the customer. That, coupled with a month of straight application load testing and a 1,000 user beta launch, meant that when the site finally went "live", I was able to get a full night's sleep (which is fairly uncommon). Neither of these situations constitutes the norm, though. Of course, there's nothing better than several thousand beta testers hitting your site to help find those contingencies that you never thought of.
I'm sure you can figure out the kinds of errors that always sneak in, so ask yourself, for example, whether they are due to rather superficial testing, e.g. randomly clicking around and checking if things appear to be right.
In order to improve I propose something along the following:
Create documents/checklists that specify all testing procedures.
Get regular people to test, not just the folks who built the application.
Set up a staging environment which closely resembles production.
Post-launch, analyze what went wrong and why it went wrong.
Maybe get external QA to check on your procedures.
Now, all those suggestions are of course very obvious but implementing them into your launch procedures will require time.
In general this really is an ongoing process which will help you and your colleagues to improve. And also be happier, because fixing bugs in production just makes you age rapidly. ;-)
Keep in mind, you won't be done the first time. Documents are heavy, which is why people don't read them. People are also lazy and don't follow the procedures. This means that you always have to analyze what happened, go back, and improve the procedures.
If you have the opportunity, I'd also spend some time looking at why nothing went wrong with another launch and comparing that to the usual.
Joining an existing team with a large codebase already in place can be daunting. What's the best approach?
Broad; try to get a general overview of how everything links together, from the code
Narrow; focus on small sections of code at a time, understanding how they work fully
Pick a feature to develop and learn as you go along
Try to gain insight from class diagrams and uml, if available (and up to date)
Something else entirely?
I'm working on what is currently an approx 20k line C++ app & library (Edit: small in the grand scheme of things!). In industry I imagine you'd get an introduction from an experienced programmer. However, if this is not the case, what can you do to start adding value as quickly as possible?
--
Summary of answers:
Step through code in debug mode to see how it works
Pair up with someone more familiar with the code base than you, taking turns to be the person coding and the person watching/discussing. Rotate partners amongst team members so knowledge gets spread around.
Write unit tests. Start with an assertion of how you think the code will work. If it turns out as you expected, you've probably understood the code. If not, you've got a puzzle to solve and/or an enquiry to make. (Thanks Donal, this is a great answer)
Go through existing unit tests for functional code, in a similar fashion to above
Read UML, Doxygen generated class diagrams and other documentation to get a broad feel of the code.
Make small edits or bug fixes, then gradually build up
Keep notes, and don't jump in and start developing; it's more valuable to spend time understanding than to generate messy or inappropriate code.
this post is a partial duplicate of the-best-way-to-familiarize-yourself-with-an-inherited-codebase
Start with some small task if possible, debug the code around your problem.
Stepping through code in debug mode is the easiest way to learn how something works.
Another option is to write tests for the features you're interested in. Setting up the test harness is a good way of establishing what dependencies the system has and where its state resides. Each test starts with an assertion about the way you think the system should work. If it turns out to work that way, you've achieved something and you've got some working sample code to reproduce it. If it doesn't work that way, you've got a puzzle to solve and a line of enquiry to follow.
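For example (a made-up JavaScript sketch using Node's built-in assert; the module and function under test are hypothetical), the first test is just your assumption written down:

const assert = require("assert");
const { calculateDiscount } = require("./pricing"); // hypothetical module

// My current assumption about how the system behaves; if this fails,
// I've found a line of enquiry rather than proof that the code is wrong.
assert.strictEqual(calculateDiscount(100, "GOLD"), 85);
console.log("Assumption held: gold customers get 15% off");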
One thing that I usually suggest to people that has not yet been mentioned is that it is important to become a competent user of the existing code base before you can be a developer. When new developers come into our large software project, I suggest that they spend time becoming expert users before diving in trying to work on the code.
Maybe that's obvious, but I have seen a lot of people try to jump into the code too quickly because they are eager to start making progress.
This is quite dependent on what sort of learner and what sort of programmer you are, but:
Broad first - you need an idea of scope and size. This might include skimming docs/uml if they're good. If it's a long term project and you're going to need a full understanding of everything, I might actually read the docs properly. Again, if they're good.
Narrow - pick something manageable and try to understand it. Get a "taste" for the code.
Pick a feature - possibly a different one to the one you just looked at if you're feeling confident, and start making some small changes.
Iterate - assess how well things have gone and see if you could benefit from repeating an early step in more depth.
Pairing with strict rotation.
If possible, while going through the documentation/codebase, try to employ pairing with strict rotation. Meaning, two of you sit together for a fixed period of time (say, a 2 hour session), then you switch pairs, one person will continue working on that task while the other moves to another task with another partner.
In pairs you'll both pick up a piece of knowledge, which can then be fed to other members of the team when the rotation occurs. What's good about this also, is that when a new pair is brought together, the one who worked on the task (in this case, investigating the code) can then summarise and explain the concepts in a more easily understood way. As time progresses everyone should be at a similar level of understanding, and hopefully avoid the "Oh, only John knows that bit of the code" syndrome.
From what I can tell about your scenario, you have a good number for this (3 pairs), however, if you're distributed, or not working to the same timescale, it's unlikely to be possible.
I would suggest running Doxygen on it to get an up-to-date class diagram, then going from broad to in-depth for a while. This gives you a quickie big picture that you can use as you get up close and dirty with the code.
I agree that it depends entirely on what type of learner you are. Having said that, I've been at two companies which had very large code-bases to begin with. Typically, I work like this:
If possible, before looking at any of the functional code, I go through unit tests that are already written. These can generally help out quite a lot. If they aren't available, then I do the following.
First, I largely ignore implementation and look only at header files, or just the class interfaces. I try to get an idea of what the purpose of each class is. Second, I go one level deep into the implementation starting with what seems to be the area of most importance. This is hard to gauge, so occasionally I just start at the top and work my way down in the file list. I call this breadth-first learning. After this initial step, I generally go depth-wise through the rest of the code. The initial breadth-first look helps to solidify/fix any ideas I got from the interface level, and then the depth-wise look shows me the patterns that have been used to implement the system, as well as the different design ideas. By depth-first, I mean you basically step through the program using the debugger, stepping into each function to see how it works, and so on. This obviously isn't possible with really large systems, but 20k LOC is not that many. :)
Work with another programmer who is more familiar with the system to develop a new feature or to fix a bug. This is the method that I've seen work out the best.
I think you need to tie this to a particular task. When you have time on your hands, go for whichever approach you are in the mood for.
When you have something that needs to get done, give yourself a narrow focus and get it done.
Get the team to put you on bug fixing for two weeks (if you have two weeks). They'll be happy to get someone to take responsibility for that, and by the end of the period you will have spent so much time problem-solving with the library that you'll probably know it pretty well.
If it has unit tests (I'm betting it doesn't), start small and make sure the unit tests don't fail. If you stare at the entire codebase at once your eyes will glaze over and you will feel overwhelmed.
If there are no unit tests, you need to focus on the feature that you want. Run the app and look at the results of things that your feature should affect. Then start looking through the code trying to figure out how the app creates the things you want to change. Finally change it and check that the results come out the way you want.
You mentioned it is an app and a library. First change the app and stick to using the library as a user. Then after you learn the library it will be easier to change.
From a top-down approach, the app probably has a main loop or a main GUI that controls all the action, so it is worth understanding the main control flow of the application. It is worth reading the code to give yourself a broad overview of the main flow of the app. If it is a GUI app, create a paper diagram that shows which screens there are and how to get from one screen to another. If it is a command-line app, sketch how the processing is done.
Even in companies it is not unusual to have this approach. Often no one fully understands how an application works. And people don't have time to show you around. They prefer specific questions about specific things so you have to dig in and experiment on your own. Then once you get your specific question you can try to isolate the source of knowledge for that piece of the application and ask it.
Start by understanding the 'problem domain' (is it a payroll system? inventory? real time control or whatever). If you don't understand the jargon the users use, you'll never understand the code.
Then look at the object model; there might already be a diagram or you might have to reverse engineer one (either manually or using a tool as suggested by Doug). At this stage you could also investigate the database (if any); it should follow the object model, but it may not, and that's important to know.
Have a look at the change history or bug database, if there's an area that comes up a lot, look into that bit first. This doesn't mean that it's badly written, but that it's the bit everyone uses.
Lastly, keep some notes (I prefer a wiki).
The existing guys can use it to sanity check your assumptions and help you out.
You will need to refer back to it later.
The next new guy on the team will really thank you.
I had a similar situation. I'd say you go like this:
If it's a database-driven application, start from the database and try to make sense of each table, its fields, and then its relation to the other tables.
Once you're comfortable with the underlying store, move up to the ORM layer. Those tables must have some kind of representation in code.
Once done with that, move on to how and where these objects come from. An interface? What interface? Any validations? What preprocessing takes place on them before they go to the datastore?
This would familiarize you better with the system. Remember that trying to write or understand unit tests is only possible when you know very well what is being tested and why it needs to be tested in only that way.
And in the case of a large application that is not database-driven, I'd recommend another approach:
What is the main goal of the system?
What are the major components the system is divided into to solve this problem?
What interactions does each component have with the others? Make a graph that depicts component dependencies. Ask someone already working on it. These components must be exchanging something with each other, so try to figure that out as well (e.g. the IO component might return a File object back to the GUI).
Once comfortable with this, dive into the component that is least dependent on the others. Now study how that component is further divided into classes and how they interact with each other. This way you get the hang of a single component in total.
Move to the next least dependent component.
At the very end, move to the core component, which typically has dependencies on many of the other components you've already tackled.
While looking at the core component, you might be referring back to the components you examined earlier, so don't worry, keep working hard!
For the first strategy:
Take this stackoverflow site, for instance. Examine the datastore: what is being stored, how it is stored, what representations those items have in the code, and how and where they are presented in the UI. Where do they come from, and what processing takes place on them before they go back to the datastore?
For the second one
Take a word processor, for example. What components are there? IO, UI, Page, and the like. How are these interacting with each other? Move along as you learn further.
Be relaxed. Written code is someone's mindset, frozen logic, and thinking style, and it takes time to read that mind.
First, if you have team members available who have experience with the code you should arrange for them to do an overview of the code with you. Each team member should provide you with information on their area of expertise. It is usually valuable to get multiple people explaining things, because some will be better at explaining than others and some will have a better understanding than others.
Then, you need to start reading the code for a while without any pressure (a couple of days or a week if your boss will provide that). It often helps to compile/build the project yourself and be able to run the project in debug mode so you can step through the code. Then, start getting your feet wet, fixing small bugs and making small enhancements. You will hopefully soon be ready for a medium-sized project, and later, a big project. Continue to lean on your team-mates as you go - often you can find one in particular who is willing to mentor you.
Don't be too hard on yourself if you struggle - that's normal. It can take a long time, maybe years, to understand a large code base. Actually, it's often the case that even after years there are still some parts of the code that are still a bit scary and opaque. When you get downtime between projects you can dig in to those areas and you'll often find that after a few tries you can figure even those parts out.
Good luck!
You may want to consider looking at source code reverse engineering tools. There are two tools that I know of:
SWAG Kit (Linux only) link
Bauhaus academic commercial
Both tools offer similar feature sets that include static analysis that produces graphs of the relations between modules in the software.
This mostly consists of call graphs and type/class dependencies. Viewing this information should give you a good picture of how the parts of the code relate to one another. Using this information, you can dig into the actual source for the parts that you are most interested in and that you need to understand/modify first.
I find that just jumping into code can be a bit overwhelming. Try to read as much documentation on the design as possible. This will hopefully explain the purpose and structure of each component. It's best if an existing developer can take you through it, but that isn't always possible.
Once you are comfortable with the high-level structure of the code, try to fix a bug or two. This will help you get to grips with the actual code.
I like all the answers that say you should use a tool like Doxygen to get a class diagram, and first try to understand the big picture. I totally agree with this.
That said, this largely depends on how well factored the code is to begin with. If it's a gigantic mess, it's going to be hard to learn. If it's clean and organized properly, it shouldn't be that bad.
See this answer on how to use test coverage tools to locate the code for a feature of interest, without knowing anything about where that feature is, or how it is spread across many modules.
(shameless marketing ahead)
You should check out nWire. It is an Eclipse plugin for navigating and visualizing large codebases. Many of our customers use it to break in new developers by printing out visualizations of the major flows.
Stacker Nobody asked about the most shocking thing new programmers find as they enter the field.
Very high on the list is the impact of inheriting a codebase with which one must rapidly become acquainted. It can be quite a shock to suddenly find yourself charged with maintaining N lines of code that have been cobbled together for who knows how long, and to have a short time in which to start contributing to it.
How do you efficiently absorb all this new data? What eases this transition? Is the only real solution to have already contributed to enough open-source projects that the shock wears off?
This also applies to veteran programmers. What techniques do you use to ease the transition into a new codebase?
I added the Community-Building tag to this because I'd also like to hear some war-stories about these transitions. Feel free to share how you handled a particularly stressful learning curve.
Pencil & Notebook ( don't get distracted trying to create a unrequested solution)
Make notes as you go, and take an hour every Monday to read through and arrange the notes from previous weeks.
With large codebases, first impressions can be deceiving, and issues tend to rearrange themselves rapidly while you are familiarizing yourself.
Remember the issues from your last work environment aren't necessarily valid or germane in your new environment. Beware of preconceived notions.
The notes/observations you make will help you learn quickly what questions to ask and of whom.
Hopefully you've been gathering the names of all the official (and unofficial) stakeholders.
One of the best ways to familiarize yourself with inherited code is to get your hands dirty. Start with fixing a few simple bugs and work your way into more complex ones. That will warm you up to the code better than trying to systematically review the code.
If there's a requirements or functional specification document (which is hopefully up-to-date), you must read it.
If there's a high-level or detailed design document (which is hopefully up-to-date), you probably should read it.
Another good way is to arrange a "transfer of information" session with the people who are familiar with the code, where they provide a presentation of the high level design and also do a walk-through of important/tricky parts of the code.
Write unit tests. You'll find the warts quicker, and you'll be more confident when the time comes to change the code.
Try to understand the business logic behind the code. Once you know why the code was written in the first place and what it is supposed to do, you can start reading through it or, as someone said, probably fixing a few bugs here and there.
My steps would be:
1.) Set up a Source Insight (or any good source code browser you use) workspace/project with all the source and header files in the code base. Browse at a higher level from the topmost function (main) down to the lowermost functions. During this code browsing, keep making notes on paper or in a document, tracing the flow of the function calls. Do not get into function implementation nitty-gritty in this step; keep that for later iterations. In this step, keep track of what arguments are passed to functions, the return values, how the arguments passed to functions are initialized, how their values are modified, and how the return values are used.
2.) After one iteration of step 1.), when you have some understanding of the code and the data structures used in the code base, set up an MSVC project (or any other relevant compiler project according to the programming language of the code base), compile the code, execute it with a valid test case, and single-step through the code again from main down to the last level of function. In between the function calls, keep noting the values of variables passed and returned, the various code paths taken, the various code paths avoided, etc.
3.) Keep repeating 1.) and 2.) iteratively until you are comfortable enough that you can change some code, add some code, or find and fix a bug in existing code!
-AD
I don't know about this being "the best way", but something I did at a recent job was to write a code spider/parser (in Ruby) that went through and built a call tree (and a reverse call tree) which I could later query. This was slightly non-trivial because we had PHP which called Perl which called SQL functions/procedures. Any other code-crawling tools would help in a similar fashion (i.e. javadoc, rdoc, perldoc, Doxygen etc.).
Reading any unit tests or specs can be quite enlightening.
Documenting things helps (either for yourself, or for other teammates, current and future). Read any existing documentation.
Of course, don't underestimate the power of simply asking a fellow teammate (or your boss!) questions. Early on, I asked as often as necessary "do we have a function/script/foo that does X?"
Go over the core libraries and read the function declarations. If it's C/C++, this means only the headers. Document whatever you don't understand.
The last time I did this, one of the comments I inserted was "This class is never used".
Do try to understand the code by fixing bugs in it. Do correct or maintain documentation. Don't modify comments in the code itself; that risks introducing new bugs.
In our line of work, generally speaking we do no changes to production code without good reason. This includes cosmetic changes; even these can introduce bugs.
No matter how disgusting a section of code seems, don't be tempted to rewrite it unless you have a bugfix or other change to do. If you spot a bug (or possible bug) when reading the code trying to learn it, record the bug for later triage, but don't attempt to fix it.
Another Procedure...
After reading Andy Hunt's "Pragmatic Thinking and Learning - Refactor Your Wetware" (which doesn't address this directly), I picked up a few tips that may be worth mentioning:
Observe Behavior:
If there's a UI, all the better. Use the app and get a mental map of relationships (e.g. links, modals, etc.). Look at HTTP requests if it helps, but don't put too much emphasis on it; you just want a light, friendly acquaintance with the app.
Acknowledge the Folder Structure:
Once again, this is light. Just see what belongs where, and hope that the structure is semantic enough -- you can always get some top-level information from here.
Analyze Call-Stacks, Top-Down:
Go through and list on paper or some other medium, but try not to type it -- this gets different parts of your brain engaged (build it out of Legos if you have to) -- function-calls, Objects, and variables that are closest to top-level first. Look at constants and modules, make sure you don't dive into fine-grained features if you can help it.
MindMap It!:
Maybe the most important step. Create a very rough draft mapping of your current understanding of the code. Make sure you run through the mindmap quickly. This allows different parts of your brain (mostly R-mode) to have a say in the map.
Create clouds, boxes, etc. wherever you initially think they should go on the paper. Feel free to denote boxes with syntactic symbols (e.g. 'F'-function, 'f'-closure, 'C'-constant, 'V'-global var, 'v'-low-level var, etc.). Use arrows: incoming arrows for arguments, outgoing for returns, or whatever comes more naturally to you.
Start drawing connections to denote relationships. It's OK if it looks messy; this is a first draft.
Make a quick rough revision. If it's too hard to read, do another quick organization of it, but don't do more than one revision.
Open the Debugger:
Validate or invalidate any notions you had after the mapping. Track variables, arguments, returns, etc.
Track HTTP requests etc to get an idea of where the data is coming from. Look at the headers themselves but don't dive into the details of the request body.
MindMap Again!:
Now you should have a decent idea of most of the top-level functionality.
Create a new MindMap that has anything you missed in the first one. You can take more time with this one and even add some relatively small details -- but don't be afraid of what previous notions they may conflict with.
Compare this map with your last one and eliminate any question you had before, jot down new questions, and jot down conflicting perspectives.
Revise this map if it's too hazy. Revise as much as you want, but keep revisions to a minimum.
Pretend Its Not Code:
If you can put it into mechanical terms, do so. The most important part of this is to come up with a metaphor for the app's behavior and/or smaller parts of the code. Think of ridiculous things, seriously. If it were an animal, a monster, a star, a robot, what kind would it be? If it were in Star Trek, what would they use it for? Think of many things to weigh it against.
Synthesis over Analysis:
Now you want to see not 'what' but 'how'. Any low-level parts that throw you for a loop can be taken out and put into a sterile environment (you control the inputs). What sort of outputs are you getting? Is the system more complex than you originally thought? Simpler? Does it need improvements?
Contribute Something, Dude!:
Write a test, fix a bug, comment it, abstract it. You should have enough ability to start making minor contributions, and FAILING IS OK :)! Note any changes you made in commits, chat, or email. If you did something dastardly, your teammates can catch it before it goes to production; if something is wrong, it's a great way to get a teammate to clear things up for you. Usually listening to a teammate talk will clear up a lot of what made your MindMaps clash.
In a nutshell, the most important thing to do is use a top-down fashion of getting as many different parts of your brain engaged as possible. It may even help to close your laptop and face your seat out the window if possible. Studies have shown that enforcing a deadline creates a "Pressure Hangover" for ~2.5 days after the deadline, which is why deadlines are often best to have on a Friday. So, BE RELAXED, THERE'S NO TIMECRUNCH, AND NOW PROVIDE YOURSELF WITH AN ENVIRONMENT THAT'S SAFE TO FAIL IN. Most of this can be fairly rushed through until you get down to details. Make sure that you don't bypass understanding of high-level topics.
Hope this helps you as well :)
All really good answers here. Just wanted to add a few more things:
One can pair architectural understanding with flash cards, and revisiting those can solidify understanding. I find questions such as "Which part of the code implements X?" useful, where X could be any piece of functionality in your code base.
I also like to open a buffer in emacs and start re-writing some parts of the code base that I want to familiarize myself with and add my own comments etc.
One thing vi and emacs users can do is use tags. Tags are contained in a file (usually called TAGS). You generate one or more tags files with a command (etags for emacs, vtags for vi). Then, when you edit source code and see a confusing function or variable, you load the tags file and it will take you to where the function is declared (not perfect, but good enough). I've actually written some macros that let you navigate source using Alt-cursor, sort of like popd and pushd in many flavors of UNIX.
BubbaT
The first thing I do before going down into code is to use the application (as several different users, if necessary) to understand all the functionalities and see how they connect (how information flows inside the application).
After that I examine the framework in which the application was built, so that I can make a direct relationship between all the interfaces I have just seen with some View or UI code.
Then I look at the database and any database command handling layer (if applicable), to understand how that information (which users manipulate) is stored and how it goes to and comes from the application.
Finally, after learning where data comes from and how it is displayed I look at the business logic layer to see how data gets transformed.
I believe every application architecture can be divided like this, and knowing the overall function (a who's who in your application) can be beneficial before really debugging it or adding new stuff; that is, if you have enough time to do so.
And yes, it also helps a lot to talk with someone who developed the current version of the software. However, if he/she is going to leave the company soon, keep a note of his/her wish list (what they wanted to do for the project but were unable to because of budget constraints).
Create documentation for each thing you figure out from the codebase.
Find out how it works by experimentation: change a few lines here and there and see what happens.
Use Geany, as it speeds up searching for commonly used variables and functions in the program and adds them to autocomplete.
Find out if you can contact the original developers of the code base, through Facebook or by googling for them.
Find out the original purpose of the code and see if the code still fits that purpose or should be rewritten from scratch to fulfill the intended purpose.
Find out what frameworks the code uses and what editors were used to produce it.
The easiest way to deduce how the code works is to think through how you would have done a certain part yourself, then check whether the code has such a part.
It's reverse engineering: figuring out something by trying to re-engineer the solution.
Most computer programmers have experience in coding, and there are certain patterns that you can look for if they're present in the code.
There are two types of code, object-oriented and structurally oriented.
If you know how to do both, you're good to go, but if you aren't familiar with one or the other, you'll have to relearn how to program in that fashion to understand why it was coded that way.
In object-oriented code, you can easily create diagrams documenting the behaviors and methods of each object class.
If it's structurally oriented, meaning organized by function, create a function list documenting what each function does and where it appears in the code.
I haven't done either of the above myself; as I'm a web developer, it is relatively easy to figure out how something works by starting from index.php and following through to the rest of the pages.
Good luck.