The point of Stryker.Net since.ignore-changes-in config

Stryker-net has an option since.ignore-changes-in, and I'm trying to understand in which use cases it may be useful to ignore non-C# files.[1]
The docs give this example value: ['**/*Assets.json','**/favicon.ico']. But if my last commit changes only Assets.json, and I run Stryker with "since": {"target": "<sha1 of my previous commit>"}, then no mutation will be found for this change, with or without this ignore-changes-in option, right?
What am I missing?
[1]: Sure, if I don't find it useful I could just avoid using it on my personal projects. However, I'm responsible for providing Stryker in CI for several teams at my company, so I want to make sure I have a good understanding of the consequences of using (or not using) it.

I can think of two cases.
First, it might bring some performance gain (simply by reducing the number of files Stryker has to examine), e.g. when your change updates 100 XML localizations of some string.
Second, you might have real code you don't want to run Stryker against. E.g. you have some Obsolete folder with a bunch of projects you don't care about, but which gets automatically updated by renamings and so on. This might be interchangeable with the mutate option, unless you want to have different filters for "normal" and "diff" Stryker runs.

Why remove QA IDs from the code base in production

In a React application that we are developing we use QA IDs for Selenium tests.
Is it bad practice to leave them in the code base in production (live)?
If so, why? Is it only to try to keep the TTFB (time to first byte) low?
More context:
A util for automation tags that takes a given string and returns an object containing properties to be spread onto an element. These should only be used for testing purposes, as they should only be available during a test run.
Eg.
// _ is lodash; IS_PROD is a build-time environment flag.
const automationTags = (givenTag) =>
  (IS_PROD || !givenTag) ? {} : { 'data-qa': _.kebabCase(givenTag) };
Usage:
<Component {...automationTags(`${dataQa}-button`)} />
<Component {...automationTags('profile-page-success-btn')} />
...where dataQa is the prop used within React components.
I can see a number of reasons for leaving the IDs in.
It reduces the complexity of the code. Rather than having logic all over the code base deciding whether or not to include the IDs, the ID simply exists.
It may be easy to remove them, but when you multiply that by the number of IDs, that is a lot of code included for little to no benefit.
It increases the fidelity between the various environments. When you run the automation tests against your test environments, you can be fairly certain it will be the same code that is going to production and not another variation of it.
It allows you to run automation against the production environment. Rather than assuming it works after a deploy, you now know it works. You know what they say about assuming, right?
It's not like the IDs can be abused by your customers. If your customers are inspecting the pages/code, simply knowing the IDs of the elements is not harmful.
Rather than having to include specific QA IDs, maybe it would be a good practice to simply use actual IDs, class names and regular CSS, so you don't need specific QA IDs at all.
All of the above more than make up for the risk of leaving them in. I wouldn't think any performance gain would be measurable.
Having said that, actually measuring it would be the best way to convince the developers they don't need to remove them. For instance, Chrome's Developer Console has a way to measure the page load, and where the resources are going.
If you go to Performance, you can record a page load and see what it takes to load a page.
Comparing production with test environments should give you more information.

Change config values at a specific time

I just got a mail saying that I have to change a config value on 2009-09-01 (new taxes). Our normal approach for this would be to wake up at 23:59 on 2009-08-31 and just change the value manually. That's not a big problem, since this doesn't happen too often. But it makes me wonder how other people handle issues like this.
So! How do you handle date-specific config changes?
(We are working in ASP.NET, but I don't think this has to be language specific.)
Br
Carl Bergquist
I'd normally store this kind of data in a database table like this
Key, Value, EffectiveFrom, EffectiveTo
-----------------------------------------
VAT, 15.0, 20081201, 20091231
VAT, 17.5, 20100101, NULL
I'd then use the EffectiveFrom and EffectiveTo dates to choose the value that is effective at the given time. If the rate is open-ended, then the EffectiveTo could be either NULL or 99991231.
This also allows you to go back without having to change the config. E.g. if someone asks you to recalculate the tax for the previous month before the rate change.
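For illustration, a minimal lookup over rows loaded from such a table might look like this (a TypeScript sketch; the row shape and the in-memory filtering are assumptions, in practice you'd push the date filter into the SQL query itself):
type ConfigRow = { key: string; value: number; effectiveFrom: Date; effectiveTo: Date | null };

// Returns the value whose [EffectiveFrom, EffectiveTo] range covers `at`.
// An open-ended rate is modelled here as effectiveTo === null.
function effectiveValue(rows: ConfigRow[], key: string, at: Date): number | undefined {
  return rows.find(r =>
    r.key === key &&
    r.effectiveFrom.getTime() <= at.getTime() &&
    (r.effectiveTo === null || at.getTime() <= r.effectiveTo.getTime())
  )?.value;
}
Given the rows above, effectiveValue(rows, 'VAT', new Date('2009-06-15')) would return 15.0.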
In Linux, there is a command, at, for one-off scheduled execution; e.g. echo /path/to/change-config.sh | at 23:59 runs the script at the next 23:59.
See man at for details.
To be honest, waking up near the time and changing it seems to be the simplest and cheapest approach. All of the technical solutions are fine, but it depends where you work.
In our environment it would be cheaper and simpler to get someone to wake up and make the change than to redevelop the functionality of a piece of software that already works. It certainly involves less testing, development overhead and costs which means we would tend to solve the problem as you do, manually.
That depends totally on the situation and the technology.
pjp's idea is good, if you get your config from a database, or as metadata to define the valid time for whole config sets/files.
Another might be to just prepare a new config file with the new entries and swap the files at midnight (probably with a restart of the service/program).
Swapping them would be possible with at (as given by Neeraj)...
If timing is a problem, you should handle the change, or at least the timing of the change, on the running server (to avoid time-out-of-sync problems).
We had the same kind of problem some time ago and handled it using the following approach.
This is suitable if you know the source that originates the configuration changes well.
In our case, the source (actually a third party) exposed a web service which returns the modified config details, and there is a Windows service running on our server which keeps polling the web service and updates the configuration file if there is any change.
This works perfectly in our case.
You can make use of this approach by pointing the polling part at your own source of config changes (say, reading changes from some disk path), but I'm not sure how this would work when the changes arrive by email.
Why not just make a shell script to swap out the files? Run it from cron to switch the files out a minute before the deadline, and send an alert text if it was NOT successful and an email if it was.
This is an example on a Linux box but I think you get the point and can do this on a Windows box.
Script:
#!/bin/sh
# Back up the current config with a timestamp, then put the new one in place.
cp /path/to/old/config /path/to/backup/dir/config.$(date +%Y%m%d%H%M%S)
cp /path/to/new/config /path/to/old/config

# sendSuccessEmail and sendPanicTextAlert are placeholders for your own notifiers.
if cmp -s /path/to/new/config /path/to/old/config; then
    sendSuccessEmail
else
    sendPanicTextAlert
fi
cron entry (runs at 23:59 on 31 August):
59 23 31 8 * /path/to/script.sh
You could test this beforehand as well; just point it at some dummy directories and files.
I've seen the hybrid approach. Instead of actually changing the data model to include EffectiveDate/EndDate or manually changing the values yourself, schedule a script to change the values automatically. Also, be sure to have a solid test plan that will validate all changes.
However, this type of manual change can have a dramatic impact on reporting. If previous transactions join directly to the tables being changed, numbers in historical reports could change in a very bad way. There really is no "right" answer.
If I'm not able to do something like pjp's solution, I'd use either a scheduled task or a server job to update it automatically at the right time.
But...I'd probably still be awake checking it had worked.
Look, the best solution would be to parameterise your config file and add things like the date from which a certain entry should be used. This would negate the need for any copying or swapping of files, and your application would simply deal with it. (That goes for a config file approach or a database.)
If you cannot change the current systems and you have to go with swapping the config files, then you also have two options:
Use a scheduled task to kick off a batch job, or even a VBScript or PowerShell script (whichever you feel comfortable with). Make sure you set up the correct credentials to be able to do this in the middle of the night, and you could also add some checking and mitigation into this approach.
Write a Windows service that does this for you. Here you have all the flexibility you need. Code it to do whatever it needs to do, do all the checks you need to (so that you can keep sleeping rather than making sure it actually worked), etc. Your service would then even take care of the scheduling aspect, and all will be good. Here you could use the XML DOM and XPath, and not replace the file but simply update the specific entries as required.
Remember that any change to the config file will cause your site to restart, so make sure you take care of all the other housekeeping stuff that this could cause. (Although this would be exactly the same if you were sitting there in the middle of the night copying files around.)

What can I do to prevent write-write conflicts on a wiki-style website?

On a wiki-style website, what can I do to prevent or mitigate write-write conflicts while still allowing the site to run quickly and keeping the site easy to use?
The problem I foresee is this:
User A begins editing a file
User B begins editing the file
User A finishes editing the file
User B finishes editing the file, accidentally overwriting all of User A's edits
Here were some approaches I came up with:
Have some sort of check-out / check-in / locking system (although I don't know how to prevent people from keeping a file checked out "too long", and I don't want users to be frustrated by not being allowed to make an edit)
Have some sort of diff system that shows any other changes made when a user commits their changes and allows some sort of merge (but I'm worried this will be hard to create and would make the site "too hard" to use)
Notify users of concurrent edits while they are making their changes (some sort of AJAX?)
Any other ways to go at this? Any examples of sites that implement this well?
Remember the version number (or ID) of the last change. Then, before writing the entry, read it again and check whether that version is still the same.
In case of a conflict, inform the user that the entry they were trying to write was changed in the meantime, and support them with a diff.
Most wikis do it this way. MediaWiki, Usemod, etc.
Three-way merging: The first thing to point out is that most concurrent edits, particularly on longer documents, are to different sections of the text. As a result, by noting which revision Users A and B acquired, we can do a three-way merge, as detailed by Bill Ritcher of Guiffy Software. A three-way merge can identify where the edits have been made from the original, and unless they clash it can silently merge both edits into a new article. Ideally, at this point carry out the merge and show User B the new document so that she can choose to further revise it.
Collision resolution:
This leaves you with the scenario when both editors have edited the same section. In this case, merge everything else and offer the text of the three versions to User B - that is, include the original - with either User A's version in the textbox or User B's. That choice depends on whether you think the default should be to accept the latest (the user just clicks Save to retain their version) or force the editor to edit twice to get their changes in (they have to re-apply their changes to editor A's version of the section).
Using three-way merging like this avoids lock-outs, which are very difficult to handle well on the web (how long do you let them have the lock?), and the aggravating 'you might want to look again' scenario, which only works well for forum-style responses. It also retains the post-respond style of the web.
If you want to Ajax it up a bit, dynamically 3-way merge User A's version into User B's version while they are editing it, and notify them. Now that would be impressive.
In MediaWiki, the server accepts the first change, and then when the second edit is saved a conflicts page comes up, and the second person merges the two changes together. See Wikipedia: Help:Edit Conflicts
Using a locking mechanism will probably be the easiest to implement. Each article could have a lock field and a lock time associated with it. If the lock time exceeded some set value, you'd consider the lock invalid and remove it when checking out the article for edit. You could also keep track of open locks and remove them on session close. You'd also need to implement some concurrency control in the database (auto-generated timestamps, perhaps) so that you can make sure you are checking in an update to the version that you checked out, just in case two people were able to edit the article at the same time. Only the one with the correct version would be able to successfully check in an edit.
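A rough sketch of that lock-expiry idea (TypeScript; the field names and the 15-minute TTL are just illustrative assumptions):
const LOCK_TTL_MS = 15 * 60 * 1000; // locks older than this are treated as stale

interface ArticleLock { lockedBy?: string; lockedAt?: number; }

// Returns true if `user` may edit: no lock, their own lock, or a stale lock.
// On success, the missing or stale lock is taken over by `user`.
function tryAcquireLock(article: ArticleLock, user: string, now: number): boolean {
  const fresh = article.lockedBy !== undefined
    && article.lockedAt !== undefined
    && now - article.lockedAt < LOCK_TTL_MS;
  if (fresh && article.lockedBy !== user) return false;
  article.lockedBy = user;
  article.lockedAt = now;
  return true;
}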
You might also be able to find a difference engine that you could just use to construct differences, though displaying them in a wiki editor may be problematic -- actually displaying the differences is probably harder than constructing the diff. You'd rely on the versioning system to detect when you needed to reject an edit and perform a diff.
In Gmail, if we are writing a reply to a mail and someone else sends a reply while we are still typing, a popup appears indicating that there is a new update, and the update itself appears as another post without a page reload. This approach would suit your needs, and if you can use Ajax to show the exact post, with a link to a diff of what was just updated, while User B is still busy typing their entry, that would be great.
As Ravi (and others) have said, you could use an AJAX approach and inform the user when another change is in progress. When an edit is submitted, just indicate the textual differences and let the second user work out how to merge the two versions.
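A rough sketch of that notification idea (TypeScript; the endpoint and polling interval are made up for illustration):
// Poll the server while the user edits; warn if the article's version has moved on.
async function checkForConcurrentEdit(articleId: number, loadedVersion: number, warn: () => void): Promise<void> {
  const res = await fetch(`/api/articles/${articleId}/version`); // hypothetical endpoint
  const { version } = await res.json();
  if (version !== loadedVersion) warn();
}

setInterval(() => {
  checkForConcurrentEdit(14, 3, () => alert('Someone else has saved a newer version of this article.'));
}, 15000);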
However, I'd like to add on with something new you could try in addition to that: Open a chat dialog between the editors while they're doing their edits. You could use something like embedded Gabbly for that, for instance.
The best conflict resolution is direct dialog, I say.
Your problem (lost update) is solved best using Optimistic Concurrency Control.
One implementation is to add a version column to each editable entity of the system. On user edit you load the row and display the HTML form to the user. A hidden field gives the version, let's say 3. The update query needs to look something like:
update articles set ..., version=4 where id=14 and version=3;
If the number of rows affected is 0, then someone has already updated article 14. All you need to decide then is how to deal with the situation (a sketch of the check follows the list below). Some common solutions:
last commit wins
first commit wins
merge conflicting updates
let the user decide
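To make the check concrete, the save routine might look roughly like this (a TypeScript sketch; db.execute stands in for whatever query helper you actually use):
interface Db { execute(sql: string, params: unknown[]): Promise<{ affectedRows: number }>; }

// Returns true if the save won the race; false means someone updated the row first.
async function saveArticle(db: Db, id: number, body: string, loadedVersion: number): Promise<boolean> {
  const result = await db.execute(
    'UPDATE articles SET body = ?, version = ? WHERE id = ? AND version = ?',
    [body, loadedVersion + 1, id, loadedVersion]
  );
  return result.affectedRows === 1;
}
A false return is where you apply one of the strategies above (last commit wins, merge, ask the user, ...).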
Instead of an incrementing version int/long you can use a timestamp, but this is not recommended because:
retrieving the current time from the JVM isn't necessarily safe in a clustered environment, where nodes may not be time synchronized.
(quote from Java Persistence with Hibernate)
There's some more info in the Hibernate documentation.
At my office, we have a policy that all data tables contain 4 fields:
CreatedBy
CreatedDate
LastUpdateBy
LastUpdateDate
That way there is a nice audit trail on who has done what to the records, at least most recently.
But most importantly, it becomes easy enough to compare the LastUpdateDate of the current or edited record on the screen (this requires you to store it on the page, in a cookie, or wherever) with the value in the database. If the values don't match, you can decide what to do from there.
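A tiny sketch of that comparison (TypeScript; the field naming follows the convention above):
// The record was edited by someone else iff its LastUpdateDate moved after we
// loaded it; in that case, don't silently overwrite.
function isStale(loadedLastUpdate: Date, currentLastUpdate: Date): boolean {
  return loadedLastUpdate.getTime() !== currentLastUpdate.getTime();
}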

How are serial generators / cracks developed?

I mean, I've always wondered how the hell somebody can develop algorithms to break/cheat the constraints of legal use in many shareware programs out there.
Just out of curiosity.
Apart from being illegal, it's a very complex task.
Speaking just at a theoretical level, the common way is to disassemble the program and try to find where the key or the serial code is checked.
Easier said than done, since any serious protection scheme will check values in multiple places, and will also derive critical information from the serial key for later use, so that when you think you've guessed it, the program will crash.
To create a crack you have to identify all the points where a check is done and modify the assembly code appropriately (often inverting a conditional jump, or storing constants into memory locations).
To create a keygen you have to understand the algorithm and write a program to re-do the exact same calculation (I remember an old version of MS Office whose serial had a very simple rule: the sum of the digits had to be a multiple of 7, so writing the keygen was rather trivial).
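Just to illustrate how trivial such a rule is, here is a TypeScript toy in the spirit of that description (not any real product's scheme):
// Toy check: the serial's digit sum must be divisible by 7.
const isValidSerial = (serial: string): boolean =>
  [...serial]
    .filter(c => c >= '0' && c <= '9')
    .reduce((sum, c) => sum + Number(c), 0) % 7 === 0;

// A matching "keygen" is equally trivial: append a digit until the check passes.
const makeSerial = (base: string): string => {
  for (let d = 0; d <= 9; d++) {
    if (isValidSerial(base + d)) return base + d;
  }
  throw new Error('unreachable: some digit 0-6 always fixes the sum mod 7');
};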
Both activities require you to follow the execution of the application in a debugger and try to figure out what's happening. And you need to know the low-level API of your operating system.
Some heavily protected applications have their code encrypted so that the file can't be disassembled. It is decrypted when loaded into memory, but then they refuse to start if they detect that an in-memory debugger has started.
In essence it's something that requires very deep knowledge, ingenuity and a lot of time! Oh, did I mention that it is illegal in most countries?
If you want to know more, Google for the +ORC cracking tutorials; they are very old and probably useless nowadays, but will give you a good idea of what it means.
Anyway, a very good reason to know all this is if you want to write your own protection scheme.
The bad guys search for the key-check code using a disassembler. This is relatively easy if you know how to do it.
Afterwards you translate the key-checking code to C or another language (this step is optional). Reversing the process of key-checking gives you a key-generator.
If you know assembler, it takes roughly a weekend to learn how to do this. I did it just a few years ago (never released anything, though; it was just research for my game-development job. To write a hard-to-crack key you have to understand how people approach cracking).
Nils's post deals with key generators. For cracks, usually you find a branch point and invert (or remove) the condition. For example, you'll test to see if the software is registered, and the test may return zero if so, and then jump accordingly. You can change the "jump if equals zero (je)" to "jump if not-equals zero (jne)" by modifying a single byte (on x86, opcode 0x74 becomes 0x75). Or you can write no-operations over various portions of the code that do things that you don't want to do.
Compiled programs can be disassembled and with enough time, determined people can develop binary patches. A crack is simply a binary patch to get the program to behave differently.
First, most copy-protection schemes aren't terribly well advanced, which is why you don't see a lot of people rolling their own these days.
There are a few methods used to do this. You can step through the code in a debugger, which does generally require a decent knowledge of assembly. Using that you can get an idea of where in the program copy protection/keygen methods are called. With that, you can use a disassembler like IDA Pro to analyze the code more closely and try to understand what is going on, and how you can bypass it. I've cracked time-limited Betas before by inserting NOOP instructions over the date-check.
It really just comes down to a good understanding of software and a basic understanding of assembly. Hak5 did a two-part series on the basics of reverse engineering and cracking in the first two episodes this season. It's really basic, but it's probably exactly what you're looking for.
A would-be cracker disassembles the program and looks for the "copy protection" bits, specifically for the algorithm that determines if a serial number is valid. From that code, you can often see what pattern of bits is required to unlock the functionality, and then write a generator to create numbers with those patterns.
Another alternative is to look for functions that return "true" if the serial number is valid and "false" if it's not, then develop a binary patch so that the function always returns "true".
Everything else is largely a variant on those two ideas. Copy protection is always breakable by definition - at some point you have to end up with executable code or the processor couldn't run it.
For the serial number, you can just extract the algorithm and start throwing guesses at it, looking for a positive response. Computers are powerful; it usually only takes a little while before it starts spitting out hits.
As for hacking, I used to be able to step through programs at a high level and look for a point where it stopped working. Then you go back to the last "Call" that succeeded and step into it, then repeat. Back then, the copy protection was usually writing to the disk and seeing if a subsequent read succeeded (If so, the copy protection failed because they used to burn part of the floppy with a laser so it couldn't be written to).
Then it was just a matter of finding the right call and hardcoding the correct return value from that call.
I'm sure it's still similar, but they go through a lot of effort to hide the location of the call. Last one I tried I gave up because it kept loading code over the code I was single-stepping through, and I'm sure it's gotten lots more complicated since then.
I wonder why they don't just distribute personalized binaries, where the name of the owner is stored somewhere (encrypted and obfuscated) in the binary, or better, distributed over the whole binary. AFAIK Apple does this with the music files from the iTunes Store; however, there it's far too easy to remove the name from the files.
I assume each crack is different, but I would guess in most cases somebody spends a lot of time in the debugger tracing the application in question.
The serial generator takes that one step further by analyzing the algorithm that checks the serial number for validity and reverse-engineers it.

The best way to familiarize yourself with an inherited codebase

Stacker Nobody asked about the most shocking thing new programmers find as they enter the field.
Very high on the list is the impact of inheriting a codebase with which one must rapidly become acquainted. It can be quite a shock to suddenly find yourself charged with maintaining N lines of code that have been cobbled together for who knows how long, and to have a short time in which to start contributing to it.
How do you efficiently absorb all this new data? What eases this transition? Is the only real solution to have already contributed to enough open-source projects that the shock wears off?
This also applies to veteran programmers. What techniques do you use to ease the transition into a new codebase?
I added the Community-Building tag to this because I'd also like to hear some war-stories about these transitions. Feel free to share how you handled a particularly stressful learning curve.
Pencil & notebook (don't get distracted trying to create an unrequested solution).
Make notes as you go, and take an hour every Monday to read through and arrange the notes from previous weeks.
With large codebases, first impressions can be deceiving, and issues tend to rearrange themselves rapidly while you are familiarizing yourself.
Remember the issues from your last work environment aren't necessarily valid or germane in your new environment. Beware of preconceived notions.
The notes/observations you make will help you learn quickly what questions to ask and of whom.
Hopefully you've been gathering the names of all the official (and unofficial) stakeholders.
One of the best ways to familiarize yourself with inherited code is to get your hands dirty. Start with fixing a few simple bugs and work your way into more complex ones. That will warm you up to the code better than trying to systematically review the code.
If there's a requirements or functional specification document (which is hopefully up-to-date), you must read it.
If there's a high-level or detailed design document (which is hopefully up-to-date), you probably should read it.
Another good way is to arrange a "transfer of information" session with the people who are familiar with the code, where they provide a presentation of the high level design and also do a walk-through of important/tricky parts of the code.
Write unit tests. You'll find the warts quicker, and you'll be more confident when the time comes to change the code.
Try to understand the business logic behind the code. Once you know why the code was written in the first place and what it is supposed to do, you can start reading through it, or, as someone said, probably fixing a few bugs here and there.
My steps would be:
1.) Set up a Source Insight (or any good source code browser you use) workspace/project with all the source and header files in the code base. Browse at a high level, from the topmost function (main) down to the lowest-level functions. During this code browsing, keep making notes on paper or in a document, tracing the flow of the function calls. Do not get into the nitty-gritty of function implementations in this step; keep that for later iterations. For now, keep track of what arguments are passed to functions, their return values, how the arguments passed to functions are initialized, how the values of those arguments get modified, and how the return values are used.
2.) After one iteration of step 1.), once you have some picture of the code and the data structures used in the code base, set up an MSVC project (or one for whatever compiler is relevant to the code base's language), compile the code, execute it with a valid test case, and single-step through the code again from main down to the last level of function. Between the function calls, keep noting the values of variables passed and returned, the various code paths taken, the various code paths avoided, etc.
3.) Keep repeating 1.) and 2.) iteratively until you are comfortable enough that you can change some code, add some code, find a bug in existing code, or fix the bug!
-AD
I don't know about this being "the best way", but something I did at a recent job was to write a code spider/parser (in Ruby) that went through and built a call tree (and a reverse call tree) which I could later query. This was slightly non-trivial because we had PHP which called Perl which called SQL functions/procedures. Any other code-crawling tools would help in a similar fashion (e.g. javadoc, rdoc, perldoc, Doxygen, etc.).
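For flavor, the core of such a spider can be tiny. A naive single-language version might look like this (sketched in TypeScript rather than the Ruby original; the regexes are crude assumptions, and a real crawl across PHP/Perl/SQL would need per-language rules):
import { readFileSync } from 'fs';

// Build caller -> callees by attributing every `name(` to the enclosing function.
// Crude by design: no scoping, no handling of comments or string literals.
function buildCallTree(files: string[]): Map<string, Set<string>> {
  const tree = new Map<string, Set<string>>();
  let current = '<top-level>';
  for (const file of files) {
    for (const line of readFileSync(file, 'utf8').split('\n')) {
      const def = /^\s*function\s+(\w+)/.exec(line);
      if (def) { current = def[1]; continue; }
      for (const call of line.matchAll(/(\w+)\s*\(/g)) {
        if (!tree.has(current)) tree.set(current, new Set());
        tree.get(current)!.add(call[1]);
      }
    }
  }
  return tree; // invert the caller/callee pairs to get the reverse call tree
}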
Reading any unit tests or specs can be quite enlightening.
Documenting things helps (either for yourself, or for other teammates, current and future). Read any existing documentation.
Of course, don't underestimate the power of simply asking a fellow teammate (or your boss!) questions. Early on, I asked as often as necessary "do we have a function/script/foo that does X?"
Go over the core libraries and read the function declarations. If it's C/C++, this means only the headers. Document whatever you don't understand.
The last time I did this, one of the comments I inserted was "This class is never used".
Do try to understand the code by fixing bugs in it. Do correct or maintain documentation. Don't modify comments in the code itself; that risks introducing new bugs.
In our line of work, generally speaking we do no changes to production code without good reason. This includes cosmetic changes; even these can introduce bugs.
No matter how disgusting a section of code seems, don't be tempted to rewrite it unless you have a bugfix or other change to do. If you spot a bug (or possible bug) when reading the code trying to learn it, record the bug for later triage, but don't attempt to fix it.
Another Procedure...
After reading Andy Hunt's "Pragmatic Thinking and Learning - Refactor Your Wetware" (which doesn't address this directly), I picked up a few tips that may be worth mentioning:
Observe Behavior:
If there's a UI, all the better. Use the app and get a mental map of relationships (e.g. links, modals, etc). Look at HTTP requests if it helps, but don't put too much emphasis on it -- you just want a light, friendly acquaintance with the app.
Acknowledge the Folder Structure:
Once again, this is light. Just see what belongs where, and hope that the structure is semantic enough -- you can always get some top-level information from here.
Analyze Call-Stacks, Top-Down:
Go through and list, on paper or some other medium (but try not to type it -- this gets different parts of your brain engaged; build it out of Legos if you have to), the function calls, objects, and variables that are closest to top level first. Look at constants and modules; make sure you don't dive into fine-grained features if you can help it.
MindMap It!:
Maybe the most important step. Create a very rough draft mapping of your current understanding of the code. Make sure you run through the mindmap quickly. This allows an even spread of different parts of your brain (mostly R-mode) to have a say in the map.
Create clouds, boxes, etc., wherever you initially think they should go on the paper. Feel free to denote boxes with syntactic symbols (e.g. 'F'-function, 'f'-closure, 'C'-constant, 'V'-global var, 'v'-low-level var, etc). Use arrows: incoming arrows for arguments, outgoing for returns, or whatever comes more naturally to you.
Start drawing connections to denote relationships. It's OK if it looks messy - this is a first draft.
Make a quick rough revision. If it's too hard to read, do another quick reorganization of it, but don't do more than one revision.
Open the Debugger:
Validate or invalidate any notions you had after the mapping. Track variables, arguments, returns, etc.
Track HTTP requests etc to get an idea of where the data is coming from. Look at the headers themselves but don't dive into the details of the request body.
MindMap Again!:
Now you should have a decent idea of most of the top-level functionality.
Create a new mindmap that has anything you missed in the first one. You can take more time with this one and even add some relatively small details -- but don't be afraid of what previous notions they may conflict with.
Compare this map with your last one: eliminate any questions you had before, jot down new questions, and jot down conflicting perspectives.
Revise this map if it's too hazy. Revise as much as you want, but keep revisions to a minimum.
Pretend It's Not Code:
If you can put it into mechanical terms, do so. The most important part of this is to come up with a metaphor for the app's behavior and/or smaller parts of the code. Think of ridiculous things, seriously. If it were an animal, a monster, a star, a robot, what kind would it be? If it were in Star Trek, what would they use it for? Think of many things to weigh it against.
Synthesis over Analysis:
Now you want to see not 'what' but 'how'. Any low-level parts that threw you for a loop could be taken out and put into a sterile environment (where you control the inputs). What sort of outputs are you getting? Is the system more complex than you originally thought? Simpler? Does it need improvements?
Contribute Something, Dude!:
Write a test, fix a bug, comment it, abstract it. You should have enough ability to start making minor contributions, and FAILING IS OK :)! Make notes on any changes you made in commits, chat, or email. If you did something dastardly, you guys can catch it before it goes to production -- and if something is wrong, it's a great way to get a teammate to clear things up for you. Usually listening to a teammate talk will clear up a lot of what made your mindmaps clash.
In a nutshell, the most important thing is to take a top-down approach that gets as many different parts of your brain engaged as possible. It may even help to close your laptop and face your seat out the window if possible. Studies have shown that enforcing a deadline creates a "pressure hangover" for ~2.5 days after the deadline, which is why deadlines are often best placed on a Friday. So, BE RELAXED, THERE'S NO TIME CRUNCH, AND PROVIDE YOURSELF WITH AN ENVIRONMENT THAT'S SAFE TO FAIL IN. Most of this can be fairly rushed through until you get down to details. Make sure that you don't bypass understanding of high-level topics.
Hope this helps you as well :)
All really good answers here. Just wanted to add a few more things:
One can pair architectural understanding with flash cards, and revisiting those can solidify understanding. I find questions such as "Which part of the code implements X?" useful, where X could be some functionality in your code base.
I also like to open a buffer in emacs and start re-writing some parts of the code base that I want to familiarize myself with and add my own comments etc.
One thing vi and Emacs users can do is use tags. Tags are contained in a file (usually called TAGS). You generate one or more tags files with a command (etags for Emacs, ctags for vi). Then, when you edit source code and you see a confusing function or variable, you load the tags file and it will take you to where the function is declared (not perfect, but good enough). I've actually written some macros that let you navigate source using Alt-cursor, sort of like popd and pushd in many flavors of UNIX.
BubbaT
The first thing I do before going down into code is to use the application (as several different users, if necessary) to understand all the functionalities and see how they connect (how information flows inside the application).
After that I examine the framework in which the application was built, so that I can relate the interfaces I have just seen to some view or UI code.
Then I look at the database and any database-command-handling layer (if applicable) to understand how the information which users manipulate is stored, and how it goes to and comes from the application.
Finally, after learning where the data comes from and how it is displayed, I look at the business logic layer to see how the data gets transformed.
I believe every application architecture can be divided like this, and knowing the overall function (a who-is-who in your application) may be beneficial before really debugging it or adding new stuff - that is, if you have enough time to do so.
And yes, it also helps a lot to talk with someone who developed the current version of the software. However, if he/she is going to leave the company soon, keep a note of his/her wish list (what they wanted to do for the project but were unable to because of budget constraints).
Create documentation for each thing you figure out from the codebase.
Find out how it works by experimentation - changing a few lines here and there to see what happens.
Use Geany, as it speeds up searching for commonly used variables and functions in the program and adds them to autocomplete.
Find out if you can contact the original developers of the code base, through Facebook or by googling for them.
Find out the original purpose of the code and see whether it still fits that purpose or should be rewritten from scratch to fulfill it.
Find out what frameworks the code uses and what editors were used to produce it.
The easiest way to deduce how code works is to consider how you would have written a certain part yourself, then check whether the code has such a part.
It's reverse engineering - figuring something out by trying to re-engineer the solution.
Most programmers follow certain coding patterns, and you can look for those patterns in the code.
There are two broad styles of code: object-oriented and structurally oriented.
If you know how to do both, you're good to go, but if you aren't familiar with one or the other, you'd have to relearn how to program in that fashion to understand why it was coded that way.
In object-oriented code, you can easily create diagrams documenting the behaviors and methods of each object class.
If it's structurally oriented, meaning organized by function, create a function list documenting what each function does and where it appears in the code.
I haven't done either of the above myself; as a web developer, it's relatively easy to figure out how something works by starting from index.php and following through to the other pages.
Good luck.