Can a Sequence Diagram realistically capture your logic in the same depth as code? - language-agnostic

I use UML Sequence Diagrams all the time, and am familiar with the UML2 notation.
But I only ever use them to capture the essence of what I intend to do. In other words, the diagram always exists at a level of abstraction above the actual code. Every time I try to use them to describe exactly what I intend to do, I end up with so much horizontal space and so many alt/loop frames that it's not worth the effort.
So it may be possible in theory, but has anyone ever really used a diagram at this level of detail? If so, can you provide an example?

I have the same problem, but when I realize that I am going too low-level I re-read this:
You should use sequence diagrams when you want to look at the behavior of several objects within a single use case. Sequence diagrams are good at showing collaborations among the objects; they are not so good at precise definition of the behavior.
If you want to look at the behavior of a single object across many use cases, use a state diagram. If you want to look at behavior across many use cases or many threads, consider an activity diagram.
If you want to explore multiple alternative interactions quickly, you may be better off with CRC cards, as that avoids a lot of drawing and erasing. It's often handy to have a CRC card session to explore design alternatives and then use sequence diagrams to capture any interactions that you want to refer to later.
[excerpt from Martin Fowler's UML Distilled book]

It's all relative. The law of diminishing returns always applies when making a diagram. I think it's good to show the interaction between objects (objectA initializes objectB and calls method foo on it). But it's not practical to show the internals of a function. In that regard, a sequence diagram is not practical for capturing the logic at the same depth as code. I would argue that for intricate logic, you'd want a flowchart.

I think there are two issues to consider.
Be concrete
Sequence diagrams are at their best when they are used to convey a single concrete scenario (of a use case, for example).
When you use them to depict more than one scenario, usually to show what happens in every possible path through a use case, they get complicated very quickly.
Since source code is just like a use case in this regard (i.e. a general description instead of a specific one), sequence diagrams aren't a good fit. Imagine expanding x levels of the call graph of some method and showing all that information on a single diagram, including all if and loop conditions.
That's why 'capturing the essence' as you put it, is so important.
Ideally a sequence diagram fits on a single A4/Letter page; anything larger makes the diagram unwieldy. Perhaps as a rule of thumb, limit the number of objects to 6-10 and the number of calls to 10-25.
Focus on communication
Sequence diagrams are meant to highlight communication, not internal processing.
They're very expressive when it comes to specifying the communication that happens (involved parties, asynchronous, synchronous, immediate, delayed, signal, call, etc.) but not when it comes to internal processing (only actions, really).
Also, although you can use variables, it's far from perfect. The objects at the top are, well, objects. You could consider them as variables (i.e. use their names as variables) but it just isn't very convenient.
For example, try depicting the traversal of a linked list where you need to keep tabs on an element and its predecessor with a sequence diagram. You could use two 'variable' objects called 'current' and 'previous' and add the necessary actions to set previous = current and then current = current.next, but the result is just awkward.
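For comparison, the traversal itself is only a few lines of code (a rough Java sketch, assuming a hypothetical Node class with a next field and a head reference), yet every one of these assignments would need its own message or action on the diagram:

    // Hypothetical Node with a 'next' reference; 'head' is the start of the list.
    Node previous = null;
    Node current = head;
    while (current != null) {
        previous = current;        // remember the element we just visited
        current = current.next;    // advance to the next element
    }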

Personally I have used sequence diagrams only as a description of the general interaction between different objects, i.e. as a quick "temporal interaction sketch". When I tried to go into more depth, everything quickly became confusing...
I've found that the best compromise is a "simplified" sequence diagram followed by a clear but in-depth description of the logic underneath.

The answer is no - it captures it better than your source code!
At least in some aspects. Let me elaborate.
You - like the majority of programmers, including me - think in lines of source code. But the software end product - let's call it the System - is much more than that. It exists only in the minds of your team members. In better cases it also exists on paper or in other documented forms.
There are plenty of standard 'views' for describing the System, like UML class diagrams, UML activity diagrams, etc. Each diagram shows the System from a different point of view. There are static views and dynamic views, but in an architecture/software document you don't have to stop there. You can present nonstandard views in your own words, e.g. a deployment view, a performance view, a usability view, a company-values view, a boss's-favourite-things view, etc.
Each view captures and documents certain properties of the System.
It's very important to realize that the source code is just one view. It's the most important one, because it's needed to generate the computer program, but it doesn't contain every piece of information about your System, neither explicitly nor implicitly. (E.g. data shared between program modules that are only connected via offline user activity leaves no trace in the source.) It's just a static view, which helps very little in understanding your processes, the runtime dynamics of your living, breathing program.
A classic example is the Observer pattern. Especially when it is used heavily, you will hardly understand the System's mechanics from the source code alone. That's why you use sequence diagrams in that case: they capture the 'dynamic logic' of your system a lot better than your source code.
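As a rough illustration (a minimal Java sketch with made-up Subject/Observer names, not taken from any particular codebase), the static structure tells you almost nothing about who actually gets notified at runtime:

    import java.util.ArrayList;
    import java.util.List;

    interface Observer { void update(String event); }

    class Subject {
        private final List<Observer> observers = new ArrayList<>();

        void addObserver(Observer o) { observers.add(o); }

        void fire(String event) {
            // Which concrete observers run here, and in what order, is decided
            // entirely by registrations made elsewhere at runtime.
            for (Observer o : observers) {
                o.update(event);
            }
        }
    }

Reading Subject alone, you cannot tell which objects react to fire(); a sequence diagram of one concrete scenario shows exactly that.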
But if you meant some kind of business logic in great detail, you are better off with plain text/source code/pseudocode etc. You don't have to use UML diagrams just because they are the standard. You can do use case modeling without drawing use case diagrams. Always choose the view that's best for you and your purpose.

UML diagrams are guidelines, not strict rules.
You don't have to make them as exact and detailed as the source code, but you may try it if you want.
Sometimes it's possible, sometimes it's not, because of the detail or complexity of the system, or because you don't have the time or information to do it.
Cheers.

Related

How do I document my code to understand how it flows (works)?

I am writing a small game, and I now have 9 C# scripts that make it work. I have lost track of what exactly is happening and how. I want to know how things work from the moment the game starts: what's happening and how, etc.
I am a beginner, and I have heard that writing down your program flow is called documenting it. How can I document it? Do I have to write comments everywhere in my code to explain the flow of the program?
Putting extensive comments into your code is not a good approach. Basically you should try to make your code as self-explanatory as possible. You do this by carefully planning what belongs into a class or function and by using meaningful names for your classes, functions and variables. Comments are nothing but a last resort if additional explanation is really required.
In most cases you should also have some documents in addition to the code that explain certain aspects of your software:
Requirements document - what is the purpose of the software, how is it used
Architecture and design specification - what are the modules and classes of the software and how do they interact. Often this document mainly consists of one or more diagrams (UML or something else).
Build manual - how to compile and link the software
Installation instructions
User manual
This list is neither complete nor is it mandatory. If, for example, the user interface of your software is simple and self-explaining, you probably won't need a user manual.
Sometimes diagrams make better documentation than text. There is a standard way of diagramming a control flow (whether it's of a program or a business process). They're called ... wait for it ... control-flow diagrams. But I don't think that's exactly what you're after.
There are also flow charts (often spelled as one word), which may be more suited to software than general control-flow diagrams. Flow charts can be useful for understanding an algorithm, but they generally don't give a good big-picture view.
With a complicated program, what might be more important to keep in mind is the data flow. For those we have ... can you guess? ... data-flow diagrams (DFDs).
DFDs can be drawn at varying levels of detail. You can have a high-level one that shows the major components of the system and how they fit together and low-level ones that show the nitty-gritty details for the portions of the system that require more detail.
DFDs can be used for a variety of analyses, including things like threat modeling. But I find them great for getting an overview of what's-what when I'm looking at a new project (or one I've forgotten about). You should be able to find some tutorials about DFDs online, and I think some drawing software (like Visio) has templates specifically for DFDs (and probably the other types of diagrams I've mentioned).
Some might consider DFDs a bit old-school and prefer more rigorous systems like UML (Unified Modeling Language), which is capable of expressing many more concepts and of having a very direct mapping between your "model" and your code. I've never learned enough UML to get much use out of it. The diagrams in many books on software patterns are expressed in UML.

Write programs that do one thing and do it well

I can grasp the part "do one thing" via encapsulation, Dependency Injection, Principle of Least Knowledge, and You Ain't Gonna Need It; but how do I understand the second part "do it well?"
An example given was the notion of completeness, given in the same YAGNI article:
for example, among features which allow adding items, deleting items, or modifying items, completeness could be used to also recommend "renaming items".
However, I found reasoning like that could easily be abused into feature creep, thus violating the "do one thing" part.
So, what is a litmus test for deciding whether a feature belongs in the "do it well" category (hence, include it in the function/class/program) or in the "do one thing" category (hence, exclude it)?
The first part, "do one thing," is best understood via UNIX's ls command as a counterexample, because of its inclusion of an excessive number of flags for formatting its output, which should have been delegated entirely to an external program. But I don't have a good example for the second part, "do it well."
What is a good example where removing any further feature would make it not "do it well?"
I see "Do It Well" as being as much about quality of implementation of a function than about the completeness of a set functions (in your example having rename, as well as create and delete).
Do It Well manifests in many ways, some ways of thinking:
Behaviour in response to "special" inputs. Example, calculating the mean of some integers:
int mean(int[] values) { ... }
What does this do if the array has zero elements? What if the items total more than MAX_INT? (See the sketch after this list.)
Performance Characteristics. Has sufficient attention been given to behaviour as the data volumes increase?
Dependency Failures. If our implementation depends upon other modules or infrastructure what happens when these fail. Example: File System Full, Database Down?
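To make the first point concrete, here is a hedged sketch of what "do it well" could mean for that mean() signature (throwing on an empty array and accumulating in a long to dodge int overflow are my assumptions, not the only reasonable contract):

    class Stats {
        static int mean(int[] values) {
            if (values == null || values.length == 0) {
                // An explicit, documented choice; silently returning 0 would hide bugs.
                throw new IllegalArgumentException("mean() requires at least one value");
            }
            long sum = 0;   // accumulate in a wider type so the total cannot overflow int
            for (int v : values) {
                sum += v;
            }
            return (int) (sum / values.length);
        }
    }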
Concerning feature creep itself, I think you're correct to identify a tension here. One thing you might consider: you don't need to implement every feature, provided that it's pretty obvious that a feature can be added easily without a complete rewrite.
The whole purpose of this advice is to make you favor quality over quantity.
The concept of one thing is subjective and depends on granularity. Would you say that a spreadsheet application does more than one thing if it can also print, or is that part of that one thing?
The point is that you should make sure that any feature, and the application itself, is done and will delight customers before you scramble to add new features.
I think your question points out the fundamentally organic nature of feature creep, and in understanding that nature, you will be empowered to meditate on the larger question.
Think of it like a garden: If you plant one thing and plant it well, say, a chrysanthemum, you aren't done at simply planting the seed. In fact you'll need to ensure that the soil is well tended, that the area is sufficiently protected, that the season is right, etc.
As your chrysanthemum (your one thing) grows, so too will other competing plants - some that need to be weeded out and others that may actually complement the original one thing. In fact, these other organisms may in some cases prove vital for the survival of your one thing.
Like those features that YAGN, a bit of vigilance is required to determine which weeds represent feature creep and which represent vital and complementary functions.
Regardless, having done it well means simply that your chrysanthemum is hardy, healthy, and on time. :-)
I would say an email program without the ability to add attachments would be a good example.
This may sound like an odd example, but I'd say Dropbox is a good, albeit complex, example.
It's managed to beat off a swathe of similar competing apps through a dedication to simplification and a lack of the feature creep that, as you mentioned, would violate the 'do one thing' principle. The app lets you store documents in a folder that you can access anywhere, and that's about the limit of it. They drilled down to the core problem and solved it in a way that works perfectly well in 90+% of cases.
It's hard to put a hard-and-fast rule to it, but I'd say that catering to around the 90% majority of use cases and ignoring 'fringe requirements' is the best way to stick to this rule.
I'd guess 90+% of ls use is with no arguments or maybe two or three of the most popular. The 'do it well' principle should focus on what the majority of users need, instead of catering for power users or fringe cases, as ls does with its plethora of options.
This is what dropbox does successfully and why it is pretty well agreed upon as an example of good application design.

When to make class and function?

I am a beginner to programming. When I start to code, I just start writing and solve the problem.
I write the whole program in a single main function.
I don't know when to make classes and functions.
What are good books I can read to learn these concepts?
A very general question, so just a few rules of thumb:
code reuse: when you have the same or very similar piece of code in two places, it should be moved to a function
readability: if a function spans more than a single page on screen, you may want to break it apart into several functions
focus: every class or function should do only one specific task. Everything that is not core to this purpose should be delegated to other classes/functions.
I think the canonical answer here is that you should organize your code so that it's readable and maintainable. With that said, it's also important to consider the cost of organizing your code, and how long you expect your code to live.
More directly in response to your question: functions should be used to replace repetitive or otherwise well contained pieces of code. If you apply the same 10 operations over and over again on the same kinds of elements/data you might want to think about collecting all that information into a more concise and clear function. In general, a function needs well defined inputs and outputs.
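For instance (a contrived Java sketch; the validation rules and names are made up), if the same checks show up in several places, pulling them into one function gives you exactly that - a clear input and a clear output:

    class UsernameRules {
        // Before: these three checks were copied wherever a username was handled.
        // After: one function that every caller can reuse.
        static boolean isValidUsername(String name) {
            return name != null
                && name.length() >= 3
                && name.matches("[A-Za-z0-9_]+");
        }
    }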
Classes, in essence, collect functions and data together. Much like you should use a function to collect operations into concise, well defined collections of operations, classes should organize functions and data relevant to be stored together. That is, if you have a bunch of functions that operate on some things like a steering wheel, brakes, accelerators, etc. you should think about having a Vehicle class to organize these relevant functions and data/objects.
Beyond acting as an organizational element, classes should be used to enable easy reuse and creation of multiple "things" - suppose you wanted a collection of those Vehicles. Classes allow you to tie meaning or at least some semantics to your program.
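A minimal sketch of that idea in Java (the fields and methods are illustrative assumptions, not a prescribed design):

    class Vehicle {
        private double speed;        // data the behaviour operates on
        private double fuelLevel;

        void accelerate(double amount) { speed += amount; }
        void brake(double amount)      { speed = Math.max(0, speed - amount); }
        void refuel(double litres)     { fuelLevel += litres; }

        double getSpeed() { return speed; }
        double getFuelLevel() { return fuelLevel; }
    }

Because the data and the behaviour live together, creating a whole fleet of them is trivial, e.g. a List<Vehicle>.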
The point of all this, though, is to make your life and the lives of others easier when it comes to authoring and maintaining your program. So, by all means, when you need a solution to a problem in less than ten minutes and you think it's a one-time-use program, ignore all this if you think it'll let you accomplish what you need to faster. Bear in mind, all this organization, semantics and ease of repetitive operation exists to make it easier to accomplish your objectives.
This is a stylistic and preference question and depending on how formal a place you work at it could be a matter of standards. I follow a couple of rules.
Classes
Sets of related data belong in classes together
Functions to operate on that data should be in the classes together
The classic example is the Car class; its functions would be things like Drive and AddGas
Functions
If you are going to use it more than once it should be in a function
Most functions should be no more than one screen of code
Functions should do one thing well, not a bunch of things poorly
There are a ton of opinions, but over time you must develop your own style.
It's actually very simple, nicky!
The purpose of splitting code into methods is simply to allow its reuse. When you create a method you allow your program to invoke it at any time from several places instead of repeating the code again and again.
So every time you write lines and think... 'hey, I might need this functionality again somewhere in my program', then you need to put it in a method.
As for classes, you will try to group similar functionalities together. And try to keep classes short and simple. If you need several classes, you'll also group them in packages and so on.
When I write code, I usually have a pretty good idea what I'll be using again. But often I will start to write a few lines of code and realize that I wrote something quite similar in the past. So I'll find it and put it in a method; then the two or more locations can just invoke it. That is reuse at its best!
You can often use analyzers to find various metrics which will "put a grade" on your reuse and code duplication.
Happy learning!
Have a look at
Procedure, subroutine or function?, Object-oriented programming
An object is actually a discrete bundle of functions and procedures, all relating to a particular real-world concept such as a bank account holder or hockey player in a computer game. Other pieces of software can access the object only by calling its functions and procedures that have been allowed to be called by outsiders.
Basically, you use a function/procedure/method to encapsulate a specific section of code that does a specific job, or for reusability.
Classes are used to encapsulate/represent an object, possibly with its own data, and specific functions/procedures/methods that make sense to use with this object.
In some languages classes can be made static, with static functions/procedures/methods which can then be used as helper functions/procedures/methods.
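In Java, for example, that usually looks like a non-instantiable class that exists only to group static helpers (a sketch, not a rule):

    final class StringUtils {
        private StringUtils() {}   // no instances; this class only groups helper methods

        static boolean isBlank(String s) {
            return s == null || s.trim().isEmpty();
        }
    }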
Just FYI, it'll become more evident when and why functions are useful as you progress to larger projects. I was a bit confused by their use when I first started too; when your entire program is only 20-50 lines of code following a very linear path, they don't make much sense. But when you start reusing tidbits of code, it makes sense to put them in functions. It also makes it easier to read and follow the logic of your program if you only have to read descriptive function names rather than decipher what the next 5 lines of code are supposed to do.
I found myself asking this same question, and it led me to this post.
I think that one of the most confusing things about how OOP is explained to beginners is the idea that classes represent exactly what they sound like: classes of things, like Computer, Dog, Car, etc.
This is fine as far as it goes, but it's not strictly true, and the reality is much more abstract. Sometimes, classes don't really represent anything that could be considered a clearly defined abstraction of a group of things. Sometimes, they just organize stuff.
For this reason, I think "class" is really a misnomer, or at least misleading. A more relatable way to think about what a class is might be to simply think of it as a "group" or a "logical grouping."

At what point should architecture become layered?

Obviously, "Hello World" doesn't require a separated, modular front-end and back-end. But any sort of Enterprise-grade project does.
Assuming some sort of spectrum between these points, at which stage should an application be (conceptually, or at a design level) multi-layered? When a database or some external resource is introduced? When you find that you're anticipating spaghetti code in your methods/functions?
When a database or some external resource is introduced.
But also: always (except for the most trivial of apps) separate AT LEAST the presentation tier and the application tier.
See: http://en.wikipedia.org/wiki/Multitier_architecture
Layers are a means to keep a design loosely coupled and highly cohesive.
When you start to have a few classes (either implemented or just sketched with UML), they can be grouped logically into layers - or, more generally, packages or modules. This is the art of separation of concerns.
The sooner the better: if you do not start layering early enough, you risk never doing it, as the effort required can become too great.
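One lightweight way to start that separation early (a Java sketch; the tier split and the type names are just assumptions) is to let presentation code depend only on an interface owned by the application layer:

    // Application layer: the operations the rest of the system may ask for.
    interface OrderService {
        void placeOrder(String productId, int quantity);
    }

    // Presentation layer: knows nothing about how orders are stored or processed.
    class OrderController {
        private final OrderService orders;

        OrderController(OrderService orders) { this.orders = orders; }

        void onBuyButtonClicked(String productId) {
            orders.placeOrder(productId, 1);
        }
    }

Swapping the storage or the business rules later then touches the implementation of OrderService, not the UI code.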
Here are some criteria for when to:
Any time you anticipate the need to replace one part of it with a different part.
Any time you find yourself needing to divide work among parallel teams.
There is no real answer to this question. It depends largely on your application's needs, and numerous other factors. I'd suggest reading some books on design patterns and enterprise application architecture. These two are invaluable:
Design Patterns: Elements of Reusable Object-Oriented Software
Patterns of Enterprise Application Architecture
Some other books that I highly recommend are:
The Pragmatic Programmer: From Journeyman to Master
Refactoring: Improving the Design of Existing Code
No matter your skill level, reading these will really open your eyes to a world of possibilities.
I'd say that, in most cases, having multiple distinct levels of abstraction in the concepts your code deals with is a strong signal to mirror them with levels of abstraction in your implementation.
This does not override the scenarios that others have highlighted already though.
I think once you ask yourself "hmm should I layer this" the answer is yes.
I've worked on too many projects that probably started off as a proof of concept/prototype and ended up being full projects used in production, which are horribly written and just reek of "get it done quick, we'll fix it later." Trust me, you won't fix it later.
The Pragmatic Programmer lists this as the Broken Window Theory.
Try and always do it right from the start. Separate your concerns. Build it with modularity in mind.
And of course try and think of the poor maintenance programmer who might take over when you're done!
Thinking of it in terms of layers is a little limiting. It's what you see in whitepapers about a product, but it's not how products really work. They have "boxes" that depend on each other in various ways, and you can make it look like they fit into layers but you can do this in several different configurations, depending on what information you're leaving out of the diagram.
And in a really well-designed application, the boxes get very small. They are down to the level of individual interfaces and classes.
This is important because whenever you change a line of code, you need to have some understanding of the impact your change will have, which means you have to understand exactly what the code currently does, what its responsibilities are, which means it has to be a small chunk that has a single responsibility, implementing an interface that doesn't cause clients to be dependent on things they don't need (the S and the I of SOLID).
You may find that your application can look like it has two or three simple layers, if you narrow your eyes, but it may not. That isn't really a problem. Of course, a disastrously badly designed application can look like it has layers or tiers if you squint as hard as you can. So those "high level" diagrams of an "architecture" can hide a multitude of sins.
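To make the interface-segregation part of that tangible, here is a small illustrative Java sketch (the Report types are hypothetical): splitting a fat interface means a client only depends on the slice it actually uses.

    class Report {}   // placeholder payload type for the sketch

    // One fat interface would force every client to know about everything:
    // interface ReportStore { void save(Report r); Report load(String id); void purgeAll(); }

    // Giving each kind of client only the slice it needs keeps changes local:
    interface ReportWriter { void save(Report r); }
    interface ReportReader { Report load(String id); }

    // A change to purging can no longer ripple into code that only reads reports.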
My generic rule of thumb is to at least separate the problem into a model and a view layer, and throw in a controller if there is a possibility of more than one way of handling the model or piping data to the view.
(Or, as in the first answer, at least the presentation tier and the application tier.)
Loose coupling is all about minimising dependencies, so I would say 'layer' when a dependency is introduced, i.e. a database, third-party application, etc.
Although 'layer' is probably the wrong term these days. Most of the time I use Dependency Injection (DI) through an Inversion of Control container such as Castle Windsor. This means that I can code on one part of my system without worrying about the rest. It has the side effect of ensuring loose coupling.
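The idea doesn't depend on any particular container; plain constructor injection already gives you the seam (a hand-wired Java sketch with made-up names, standing in for what a container such as Castle Windsor would wire up for you):

    interface MessageSender { void send(String to, String body); }

    class SmtpSender implements MessageSender {
        public void send(String to, String body) { /* the real SMTP call would live here */ }
    }

    class Notifier {
        private final MessageSender sender;            // depends only on the abstraction
        Notifier(MessageSender sender) { this.sender = sender; }
        void notifyUser(String user) { sender.send(user, "Your report is ready"); }
    }

    class CompositionRoot {
        public static void main(String[] args) {
            // The one place that decides which implementation is used;
            // an IoC container would do this wiring for you.
            Notifier notifier = new Notifier(new SmtpSender());
            notifier.notifyUser("alice@example.com");
        }
    }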
I would recommend DI as a general programming principle all of the time so that you have the choice on how to 'layer' your application later.
Give it a look.
R

The best way to familiarize yourself with an inherited codebase

Stacker Nobody asked about the most shocking thing new programmers find as they enter the field.
Very high on the list is the impact of inheriting a codebase with which one must rapidly become acquainted. It can be quite a shock to suddenly find yourself charged with maintaining N lines of code that have been cobbled together for who knows how long, and to have a short time in which to start contributing to it.
How do you efficiently absorb all this new data? What eases this transition? Is the only real solution to have already contributed to enough open-source projects that the shock wears off?
This also applies to veteran programmers. What techniques do you use to ease the transition into a new codebase?
I added the Community-Building tag to this because I'd also like to hear some war-stories about these transitions. Feel free to share how you handled a particularly stressful learning curve.
Pencil & Notebook ( don't get distracted trying to create a unrequested solution)
Make notes as you go and take an hour every monday to read thru and arrange the notes from previous weeks
with large codebases first impressions can be deceiving and issues tend to rearrange themselves rapidly while you are familiarizing yourself.
Remember the issues from your last work environment aren't necessarily valid or germane in your new environment. Beware of preconceived notions.
The notes/observations you make will help you learn quickly what questions to ask and of whom.
Hopefully you've been gathering the names of all the official (and unofficial) stakeholders.
One of the best ways to familiarize yourself with inherited code is to get your hands dirty. Start with fixing a few simple bugs and work your way into more complex ones. That will warm you up to the code better than trying to systematically review the code.
If there's a requirements or functional specification document (which is hopefully up-to-date), you must read it.
If there's a high-level or detailed design document (which is hopefully up-to-date), you probably should read it.
Another good way is to arrange a "transfer of information" session with the people who are familiar with the code, where they provide a presentation of the high level design and also do a walk-through of important/tricky parts of the code.
Write unit tests. You'll find the warts quicker, and you'll be more confident when the time comes to change the code.
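Even a characterization test that just pins down what the code does today is worth having (a JUnit-style sketch; InvoiceCalculator and the expected value are hypothetical stand-ins for whatever you inherited):

    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertEquals;

    class InvoiceCalculatorTest {
        @Test
        void discountIsAppliedOnceForRepeatCustomers() {
            // InvoiceCalculator is a hypothetical class from the inherited code base.
            // This test records today's behaviour; if a later change breaks it,
            // you find out immediately instead of in production.
            InvoiceCalculator calc = new InvoiceCalculator();
            assertEquals(90.0, calc.totalFor("repeat-customer", 100.0), 0.001);
        }
    }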
Try to understand the business logic behind the code. Once you know why the code was written in the first place and what it is supposed to do, you can start reading through it, or as someone said, probably fixing a few bugs here and there.
My steps would be:
1.) Set up a Source Insight (or any good source code browser you use) workspace/project with all the source and header files in the code base. Browse at a high level from the topmost function (main) to the lowermost functions. While browsing, keep making notes on paper or in a document, tracing the flow of the function calls. Do not get into the nitty-gritty of function implementations in this step; keep that for later iterations. In this step keep track of what arguments are passed to functions, the return values, how the arguments passed to functions are initialized, how their values get modified, and how the return values are used.
2.) After one iteration of step 1.), once you have some grasp of the code and data structures used in the code base, set up an MSVC project (or any other relevant compiler project according to the programming language of the code base), compile the code, execute it with a valid test case, and single-step through the code again from main down to the lowest level of function. Between the function calls, keep noting the values of variables passed and returned, the code paths taken, the code paths avoided, etc.
3.) Keep repeating 1.) and 2.) iteratively until you are comfortable enough that you can change some code, add some code, or find and fix a bug in the existing code!
-AD
I don't know about this being "the best way", but something I did at a recent job was to write a code spider/parser (in Ruby) that went through and built a call tree (and a reverse call tree) which I could later query. This was slightly non-trivial because we had PHP which called Perl which called SQL functions/procedures. Any other code-crawling tools would help in a similar fashion (i.e. javadoc, rdoc, perldoc, Doxygen etc.).
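You don't need anything that elaborate to get a first rough answer to "who calls X?"; even a naive textual scan helps (a deliberately simplistic Java sketch that assumes Java-like source and ignores comments, strings and overloads - a far cry from the real parser described above):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.TreeSet;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    class NaiveCallIndex {
        // Any identifier followed by '(' -- crude, but enough for a first overview.
        private static final Pattern CALL = Pattern.compile("\\b([A-Za-z_][A-Za-z0-9_]*)\\s*\\(");

        public static void main(String[] args) throws IOException {
            Map<String, Set<String>> calledFrom = new HashMap<>();   // name -> files that reference it
            List<Path> sources;
            try (Stream<Path> walk = Files.walk(Paths.get(args[0]))) {
                sources = walk.filter(p -> p.toString().endsWith(".java")).collect(Collectors.toList());
            }
            for (Path p : sources) {
                Matcher m = CALL.matcher(Files.readString(p));
                while (m.find()) {
                    calledFrom.computeIfAbsent(m.group(1), k -> new TreeSet<>()).add(p.toString());
                }
            }
            // A poor man's reverse call tree query: where is this name used?
            System.out.println(calledFrom.getOrDefault(args[1], Set.of()));
        }
    }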
Reading any unit tests or specs can be quite enlightening.
Documenting things helps (either for yourself, or for other teammates, current and future). Read any existing documentation.
Of course, don't underestimate the power of simply asking a fellow teammate (or your boss!) questions. Early on, I asked as often as necessary "do we have a function/script/foo that does X?"
Go over the core libraries and read the function declarations. If it's C/C++, this means only the headers. Document whatever you don't understand.
The last time I did this, one of the comments I inserted was "This class is never used".
Do try to understand the code by fixing bugs in it. Do correct or maintain documentation. Don't modify comments in the code itself; that risks introducing new bugs.
In our line of work, generally speaking we do no changes to production code without good reason. This includes cosmetic changes; even these can introduce bugs.
No matter how disgusting a section of code seems, don't be tempted to rewrite it unless you have a bugfix or other change to do. If you spot a bug (or possible bug) when reading the code trying to learn it, record the bug for later triage, but don't attempt to fix it.
Another Procedure...
After reading Andy Hunt's "Pragmatic Thinking and Learning - Refactor Your Wetware" (which doesn't address this directly), I picked up a few tips that may be worth mentioning:
Observe Behavior:
If there's a UI, all the better. Use the app and get a mental map of relationships (e.g. links, modals, etc). Look at HTTP requests if it helps, but don't put too much emphasis on it -- you just want a light, friendly acquaintance with the app.
Acknowledge the Folder Structure:
Once again, this is light. Just see what belongs where, and hope that the structure is semantic enough -- you can always get some top-level information from here.
Analyze Call-Stacks, Top-Down:
Go through and list, on paper or some other medium (but try not to type it -- this gets different parts of your brain engaged; build it out of Legos if you have to), the function calls, objects, and variables that are closest to the top level first. Look at constants and modules, and make sure you don't dive into fine-grained features if you can help it.
MindMap It!:
Maybe the most important step. Create a very rough draft mapping of your current understanding of the code. Make sure you run through the mindmap quickly. This allows different parts of your brain (mostly R-mode) to have a say in the map.
Create clouds, boxes, etc., wherever you initially think they should go on the paper. Feel free to denote boxes with syntactic symbols (e.g. 'F'-Function, 'f'-closure, 'C'-Constant, 'V'-Global Var, 'v'-low-level var, etc). Use arrows: incoming arrows for arguments, outgoing for returns, or whatever comes more naturally to you.
Start drawing connections to denote relationships. It's ok if it looks messy - this is a first draft.
Make a quick rough revision. If it's too hard to read, do another quick organization of it, but don't do more than one revision.
Open the Debugger:
Validate or invalidate any notions you had after the mapping. Track variables, arguments, returns, etc.
Track HTTP requests etc to get an idea of where the data is coming from. Look at the headers themselves but don't dive into the details of the request body.
MindMap Again!:
Now you should have a decent idea of most of the top-level functionality.
Create a new MindMap that has anything you missed in the first one. You can take more time with this one and even add some relatively small details -- but don't be afraid of what previous notions they may conflict with.
Compare this map with your last one and eliminate any question you had before, jot down new questions, and jot down conflicting perspectives.
Revise this map if it's too hazy. Revise as much as you want, but keep revisions to a minimum.
Pretend It's Not Code:
If you can put it into mechanical terms, do so. The most important part of this is to come up with a metaphor for the app's behavior and/or smaller parts of the code. Think of ridiculous things, seriously. If it were an animal, a monster, a star, a robot, what kind would it be? If it were in Star Trek, what would they use it for? Think of many things to weigh it against.
Synthesis over Analysis:
Now you want to see not 'what' but 'how'. Any low-level parts that threw you for a loop could be taken out and put into a sterile environment (where you control the inputs). What sort of outputs are you getting? Is the system more complex than you originally thought? Simpler? Does it need improvements?
Contribute Something, Dude!:
Write a test, fix a bug, comment it, abstract it. You should have enough ability to start making minor contributions, and FAILING IS OK :)! Make a note of any changes you made in commits, chat, or email. If you did something dastardly, your team can catch it before it goes to production -- and if something is wrong, it's a great way to get a teammate to clear things up for you. Usually listening to a teammate talk will clear up a lot of what made your MindMaps clash.
In a nutshell, the most important thing to do is use a top-down fashion of getting as many different parts of your brain engaged as possible. It may even help to close your laptop and face your seat out the window if possible. Studies have shown that enforcing a deadline creates a "Pressure Hangover" for ~2.5 days after the deadline, which is why deadlines are often best to have on a Friday. So, BE RELAXED, THERE'S NO TIMECRUNCH, AND NOW PROVIDE YOURSELF WITH AN ENVIRONMENT THAT'S SAFE TO FAIL IN. Most of this can be fairly rushed through until you get down to details. Make sure that you don't bypass understanding of high-level topics.
Hope this helps you as well :)
All really good answers here. Just wanted to add a few more things:
One can pair architectural understanding with flash cards, and revisiting those can solidify understanding. I find questions such as "Which part of the code provides X functionality?" useful, where X could be a useful functionality in your code base.
I also like to open a buffer in emacs and start re-writing some parts of the code base that I want to familiarize myself with and add my own comments etc.
One thing vi and emacs users can do is use tags. Tags are contained in a file (usually called TAGS). You generate one or more tags files with a command (etags for emacs, ctags for vi). Then, when you edit source code and you see a confusing function or variable, you load the tags file and it will take you to where the function is declared (not perfect, but good enough). I've actually written some macros that let you navigate source using Alt-cursor, sort of like popd and pushd in many flavors of UNIX.
BubbaT
The first thing I do before going down into code is to use the application (as several different users, if necessary) to understand all the functionalities and see how they connect (how information flows inside the application).
After that I examine the framework in which the application was built, so that I can make a direct relationship between all the interfaces I have just seen with some View or UI code.
Then I look at the database and any database command handling layer (if applicable), to understand how that information (which users manipulate) is stored and how it goes to and comes from the application.
Finally, after learning where data comes from and how it is displayed I look at the business logic layer to see how data gets transformed.
I believe every application architecture can be divided like this, and knowing the overall function (a who's who in your application) might be beneficial before really debugging it or adding new stuff - that is, if you have enough time to do so.
And yes, it also helps a lot to talk with someone who developed the current version of the software. However, if he/she is going to leave the company soon, keep a note of his/her wish list (what they wanted to do for the project but were unable to because of budget constraints).
Create documentation for each thing you figure out from the codebase.
Find out how it works by experimentation - change a few lines here and there and see what happens.
Use Geany, as it speeds up searching for commonly used variables and functions in the program and adds them to autocomplete.
Find out if you can contact the original developers of the code base, through Facebook or by googling for them.
Find out the original purpose of the code and see if it still fits that purpose or should be rewritten from scratch to fulfill the intended purpose.
Find out what frameworks the code uses and what editors were used to produce it.
The easiest way to deduce how the code works is to think about how you would have written a certain part yourself and then check whether the code has such a part.
It's reverse engineering - figuring something out by just trying to re-engineer the solution.
Most programmers have experience in coding, and there are certain patterns you can look for to see if they're present in the code.
There are two types of code: object-oriented and structurally oriented.
If you know how to do both, you're good to go; but if you aren't familiar with one or the other, you'll have to learn how to program in that fashion to understand why it was coded that way.
In object-oriented code, you can easily create diagrams documenting the behaviors and methods of each class.
If it's structurally oriented, meaning organized by function, create a function list documenting what each function does and where it appears in the code.
I haven't done either of the above myself; as a web developer, it is relatively easy to figure out how something works by starting from index.php and following through to the rest of the pages.
Good luck.