How do you refactor a class that is constantly being edited? - language-agnostic

Over the course of time, my team has created a central class that handles an agglomeration of responsibilities and runs to over 8,000 lines, all of it hand-written, not auto-generated.
The mandate has come down. We need to refactor the monster class. The biggest part of the plan is to define categories of functionality into their own classes with a has-a relationship with the monster class.
That means that a lot of references that currently read like this:
var monster = new orMonster();
var timeToOpen = monster.OpeningTime.Subtract(DateTime.Now);
will soon read like this:
var monster = new Monster();
var timeToOpen = monster.TimeKeeper.OpeningTime.Subtract(DateTime.Now);
The question is: How on Earth do we coordinate such a change? References to "orMonster" litter every single business class. Some methods are called in literally thousands of places in the code. It's guaranteed that, any time we make such a chance, someone else (probably multiple someone elses) on the team will have code checked out that calls the .OpeningTime property
How do you coordinate such a large scale change without productivity grinding to a halt?

You should make the old method call the new method. Then over time change the references to the old method to call the new method instead. Once all the client references are changed, you can delete the old method.
For more information, see Move Method in Martin Fowler's classic, Refactoring.

One thing you can do is to temporarily leave proxy methods in the monster class that will delegate to the new method. After a week or so, once you are sure all code is using the new method, then you can safely remove the proxy.

I've handled this before by going ahead and refactoring the code, but then adding methods that match the old signature that forward the calls to the new method. If you add the "Obsolete" attribute to these temporary methods, your code will still build with both the old method calls and the new method calls. Then over time you can go back through and upgrade the code that is calling the old method. The difference here is that you'll get "Warnings" during the build to help you find all of the code that needs upgrading.

I'm not sure what language you're using but in .Net you can create compiler warnings which will allow you to leave the old references for a time so that they will function as expected but place a warning for your other developers to see.
http://dotnettipoftheday.org/tips/ObsoleteAttribute.aspx

Develop your changes in a branch. Break out a subset of code to a new class, make changes across the client base, test thoroughly, and then merge back.
That concentrates the breakage to when you merge — not the entire development cycle.
Combine this with Patrick's suggestion to have the monster call the small monsters. That'll let you easily revert if your merged client code breaks changes to that client. As Patrick says, you'll be able to remove the monster's methods (now stubs) once you prove nobody's using it.
I also echo several posters' advice to expose the broken out classes directly — not via the monster. Why apply only half a cure? With the same effort, you could apply a complete cure.
Finally: write unit tests. Write lots of unit tests. Oh, boy, do you need unit tests to safely pull this one off. Did I mention you need unit tests?

Keep the old method in place and forward to the new method (as others have said) but also send a log message in the forwarding method to remind yourself to remove it.
You could just add a comment but that's too easy to miss.

Suggest using a tool such as nDepend to identify all of the references to the class methods. The output from nDepend can be used to give you a better idea about how to group the methods.

var monster = new Monster();
var timeToOpen = monster.TimeKeeper.OpeningTime.Subtract(DateTime.Now);
I'm not sure that divvying it up and just making portions of it publically available is any better. That's violating the law of demeter and can lead to NullReference pain.
I'd suggest exposing timekeeper to people without involving the monster.
If anything you'd be well off analysing the API and seeing what you can cut and encapsulate within monster. Certainly giving monster toys to play with as opposed to making monster do all of the work itself is a good call. The main effort is defining the toys monster needs to simplify his work.

Don't refactor it.
Start over and follow the law of demeter. Create a second monster class and start from scratch. When the second monster class is finished and working, then replace occurrences of the first monster. Swap it out. Hopefully they share an interface, or you can make that happen.
And instead of this: "monster.TimeKeeper.OpeningTime.Subtract(DateTime.Now)"
Do this: monster.SubtractOpeningTime(DateTime.Now). Don't kill yourself with dot-notation (hence the demeter)

Several people have provided good answers regarding the orchestration of the refactor itself. That's key. But you also asked about coordinating the changes between multiple people (which I think was the crux of your question). What source control are you using? Anything like CVS, SVN, etc can handle incoming changes from multiple developers at once. The key to making it go smoothly is that each person must make their commits granular and atomic, and each developer should pull other people's commits often.

I will look at first using partial class to split the single monster class over many files, grouping methods into categories.
You will need to stop anyone editing the monster class while you split in into the files.
From then on you are likely to get less merge conflicts as there will be less edits to each file. You can then change each method in the monster class, (one method per checkin) to call your new classes.

Such a huge class is really an issue. Since it grew so big and nobody felt uncomfortable, there must be something wrong with project policies. I'd say you should split into pairs and do pair programming. Create a branch for every pair of programmers. Work for 1-2 days on refactoring. Compare your results. This will help you avoid the situation when the refactoring would go from the start into the wrong direction and finally that would lead to the need of rewriting the monster class from scratch.

Related

Do we need to avoid duplicate at any cost?

In our application framework ,whenever a code duplicate starts happen (even in slightest degree) people immediately wanted it to be removed and added as separate function. Though it is good ,when it is being done at the early stage of development soon that function need to be changed in an unexpected way. Then they add if condition and other condition to make that function stay.
Is duplicate really evil? I feel we can wait for triplication (duplicate in three places ) to occur before making it as a function. But i am being seen as alien if i suggest this. Do we need to avoid duplicate at any cost ?
There is a quote from Donald Knuth that pin points your question:
The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.
I think that duplication is needed so we are able to understand what are the critical points that need to be DRYed.
Duplicated code is something that should be avoided but there are some (less frequent) contexts where abstractions may lead you to poor code, for example when writing test specs some degree of repetition may keep your code easier to maintain.
I guess you should always ask what is the gain of another abstraction and if there is no quick win you may leave some duplicated code and add TODO comment, this will make you code faster, deliver earlier and DRY only what is most valued.
A similar approach is described by Three Strikes And You Refactor:
The first time you do something, you just do it.
The second time you do something similar, you wince at the duplication, but you do the duplicate thing anyway.
(Or if you're like me, you don't wince just yet -- because your DuplicationRefactoringThreshold is 3 or 4.)
The third time you do something similar, you refactor.
This is the holy war between management, developers and QC.
So to make this war end you need to make a conman concept for all three of them.
Abstraction VS Encapsulation, Interface VS Class, Singleton VS Static.
you all need to give yourself time to understand each other point of view to make this work.
For example if you are using scrum then you can go sprints for developing then make a sprint for abstraction and encapsulation.

Test Driven Design - where did I go wrong?

I am playing with a toy project at home to better understand Test Driven Design. At first things seemed to be going well and I got into the swing of failing tests, code, passing test.
I then came to add a test and realised it would be difficult with my current structure and that furthermore I should split a particular class which had too many responsibilities. Adding even more responsibilities for the next test was clearly wrong. I decided to put aside this test, and refactor what I had. This is where things started to go wrong.
It was difficult to refactor without breaking lots of tests at once, and then the only option seemed to be to make many changes and hope I ended up back at something where the tests passed again. The tests themselves were valid, I just had to break nearly all of them while refactoring. The refactoring (which I'm still not that happy with) took me five or six hours before I had returned to all tests passing. The tests did help me along the way.
It feels like I got off the TDD track. What do you think I did wrong?
As this is mostly a learning exercise I'm considering rolling back all that refactoring and trying to move forward again in a better fashion.
Perhaps you went too fast when splitting your class. The steps for the Extract Class Refactoring are as follows:
create the new class
have an instance of that class as a private data member
move field to the new class, one by one
compile and test for each field
move method to the new class, one by one, testing for each
That way you won't break a large number of tests while refactoring your class, and you can rely on the tests to make sure nothing was broken so far all along the class splitting.
Also, make sure you're testing the behavior, not the implementation.
I wanted to comment the accepted answer but my current reputation does not allow me. So here it is as a new answer.
TDD says:
Create a test that fails. Code a little. Make the test pass.
It insists on coding in tiny steps (especially when beginning). View TDD as systematic validation of successive refactorings you perform to build your programs. If you take too big a step, your refactoring will get out of control.
Perhaps you were testing a too low a level. It's hard to say without seeing you code, but typically I test a feature from end to end, and make sure all the behaviour I expected to happen, happened. Testing every single method in isolation will give you the test-web you have created.
You can use tools like NCover and DotCover to check that you haven't missed any code paths.
The only thing "wrong" was to add a test afterwards. In "true" TTD you first declare all tests before the actual implementation. I say "true" because that's often only theory. But in practice you still have the safty given by the tests.
This is, sadly, something TDD proponents don't talk about enough and that makes people try TDD and then abandon it.
What I do is what I call "High Level Testing" which consists in avoiding unit tests and doing exclusively high level tests (which we could could call "integration tests"). It works pretty well and I avoid the (very important) problem you mentioned. I wrote an article about it a while ago:
http://www.hardcoded.net/articles/high-level-testing.htm
Good luck with TDD, don't give up yet.
Also for TDD continuos regression of test cases are also necessary. So Continuos integration with Coverage tools(as mentioned above) are necessary. So that small changes(i.e Refactoring) can be regressed easily and whether any code path which are missed can be found easily.
Also I feel that if tests were not written previously, time should not be wasted to think whether to write tests or not. Tests should be written immediately.

Most common examples of misuse of singleton class

When should you NOT use a singleton class although it might be very tempting to do so? It would be very nice if we had a list of most common instances of 'singletonitis' that we should take care to avoid.
Do not use a singleton for something that might evolve into a multipliable resource.
This probably sounds silly, but if you declare something a singleton you're making a very strong statement that it is absolutely unique. You're building code around it, more and more. And when you then find out after thousands of lines of code that it is not a singleton at all, you have a huge amount of work in front of you because all the other objects expect "the" sacred object of class WizBang to be a singleton.
Typical example: "There is only one database connection this application has, thus it is a singleton." - Bad idea. You may want to have several connections in the future. Better create a pool of database connections and populate it with just one instance. Acts like a Singleton, but all other code will have growable code for accessing the pool.
EDIT: I understand that theoretically you can extend a singleton into several objects. Yet there is no real life cycle (like pooling/unpooling) which means there is no real ownership of objects that have been handed out, i.e. the now multi-singleton would have to be stateless to be used simultaneously by different methods and threads.
Well singletons for the most part are just making things static anyway. So you're either in effect making data global, and we all know global variables are bad or you're writing static methods and that's not very OO now is it?
Here is a more detailed rant on why singletons are bad, by Steve Yegge. Basically you shouldn't use singletons in almost all cases, you can't really know that it's never going to be needed in more than one place.
I know many have answered with "when you have more than one", etc.
Since the original poster wanted a list of cases when you shouldn't use Singletons (rather than the top reason), I'll chime in with:
Whenever you're using it because you're not allowed to use a global!
The number of times I've had a junior engineer who has used a Singleton because they knew that I didn't accept globals in code-reviews. They often seem shocked when I point out that all they did was replace a global with a Singleton pattern and they still just have a global!
Here is a rant by my friend Alex Miller... It does not exactly enumerate "when you should NOT use a singleton" but it is a comprehensive, excellent post and argues that one should only use a singleton in rare instances, if at all.
I'm guilty of a big one a few years back (thankfully I've learned my lession since then).
What happened is that I came on board a desktop app project that had converted to .Net from VB6, and was a real mess. Things like 40-page (printed) functions and no real class structure. I built a class to encapsulate access to the database. Not a real data tier (yet), just a base class that a real data tier could use. Somewhere I got the bright idea to make this class a singleton. It worked okay for a year or so, and then we needed to build a web interface for the app as well. The singleton ended up being a huge bottleneck for the database, since all web users had to share the same connection. Again... lesson learned.
Looking back, it probably actually was the right choice for a short while, since it forced the other developers to be more disciplined about using it and made them aware of scoping issues not previously a problem in the VB6 world. But I should have changed it back after a few weeks before we had too much built up around it.
Singletons are virtually always a bad idea and generally useless/redundant since they are just a very limited simplification of a decent pattern.
Look up how Dependency Injection works. It solves the same problems, but in a much more useful way--in fact, you find it applies to many more parts of your design.
Although you can find DI libraries out there, you can also roll a basic one yourself, it's pretty easy.
I try to have only one singleton - an inversion of control / service locator object.
IService service = IoC.GetImplementationOf<IService>();
One of the things that tend to make it a nightmare is if it contains modifiable global state. I worked on a project where there were Singletons used all over the place for things that should have been solved in a completely different way (pass in strategies etc.) The "de-singletonification" was in some cases a major rewrite of parts of the system. I would argue that in the bigger part of the cases when people use a Singleton, it's just wrong b/c it looks nice in the first place, but turns into a problem especially in testing.
When you have multiple applications running in the same JVM.
A singleton is a singleton across the entire JVM, not just a single application. Even if multiple threads or applications seems to be creating a new singleton object, they're all using the same one if they run in the same JVM.
Sometimes, you assume there will only be one of a thing, then you turn out to be wrong.
Example, a database class. You assume you will only ever connect to your app's database.
// Its our database! We'll never need another
class Database
{
};
But wait! Your boss says, hook up to some other guys database. Say you want to add phpbb to the website and would like to poke its database to integrate some of its functionality. Should we make a new singleton or another instance of database? Most people agree that a new instance of the same class is preferred, there is no code duplication.
You'd rather have
Database ourDb;
Database otherDb;
than copy-past Database and make:
// Copy-pasted from our home-grown database.
class OtherGuysDatabase
{
};
The slippery slope here is that you might stop thinking about making new instance of classes and instead begin thinking its ok to have one type per every instance.
In the case of a connection (for instance), it makes sense that you wouldn't want to make the connection itself a singleton, you might need four connections, or you may need to destroy and recreate the connection a number of times.
But why wouldn't you access all of your connections through a single interface (i.e. connection manager)?

The best way to familiarize yourself with an inherited codebase

Stacker Nobody asked about the most shocking thing new programmers find as they enter the field.
Very high on the list, is the impact of inheriting a codebase with which one must rapidly become acquainted. It can be quite a shock to suddenly find yourself charged with maintaining N lines of code that has been clobbered together for who knows how long, and to have a short time in which to start contributing to it.
How do you efficiently absorb all this new data? What eases this transition? Is the only real solution to have already contributed to enough open-source projects that the shock wears off?
This also applies to veteran programmers. What techniques do you use to ease the transition into a new codebase?
I added the Community-Building tag to this because I'd also like to hear some war-stories about these transitions. Feel free to share how you handled a particularly stressful learning curve.
Pencil & Notebook ( don't get distracted trying to create a unrequested solution)
Make notes as you go and take an hour every monday to read thru and arrange the notes from previous weeks
with large codebases first impressions can be deceiving and issues tend to rearrange themselves rapidly while you are familiarizing yourself.
Remember the issues from your last work environment aren't necessarily valid or germane in your new environment. Beware of preconceived notions.
The notes/observations you make will help you learn quickly what questions to ask and of whom.
Hopefully you've been gathering the names of all the official (and unofficial) stakeholders.
One of the best ways to familiarize yourself with inherited code is to get your hands dirty. Start with fixing a few simple bugs and work your way into more complex ones. That will warm you up to the code better than trying to systematically review the code.
If there's a requirements or functional specification document (which is hopefully up-to-date), you must read it.
If there's a high-level or detailed design document (which is hopefully up-to-date), you probably should read it.
Another good way is to arrange a "transfer of information" session with the people who are familiar with the code, where they provide a presentation of the high level design and also do a walk-through of important/tricky parts of the code.
Write unit tests. You'll find the warts quicker, and you'll be more confident when the time comes to change the code.
Try to understand the business logic behind the code. Once you know why the code was written in the first place and what it is supposed to do, you can start reading through it, or as someone said, prolly fixing a few bugs here and there
My steps would be:
1.) Setup a source insight( or any good source code browser you use) workspace/project with all the source, header files, in the code base. Browsly at a higher level from the top most function(main) to lowermost function. During this code browsing, keep making notes on a paper/or a word document tracing the flow of the function calls. Do not get into function implementation nitti-gritties in this step, keep that for a later iterations. In this step keep track of what arguments are passed on to functions, return values, how the arguments that are passed to functions are initialized how the value of those arguments set modified, how the return values are used ?
2.) After one iteration of step 1.) after which you have some level of code and data structures used in the code base, setup a MSVC (or any other relevant compiler project according to the programming language of the code base), compile the code, execute with a valid test case, and single step through the code again from main till the last level of function. In between the function calls keep moting the values of variables passed, returned, various code paths taken, various code paths avoided, etc.
3.) Keep repeating 1.) and 2.) in iteratively till you are comfortable up to a point that you can change some code/add some code/find a bug in exisitng code/fix the bug!
-AD
I don't know about this being "the best way", but something I did at a recent job was to write a code spider/parser (in Ruby) that went through and built a call tree (and a reverse call tree) which I could later query. This was slightly non-trivial because we had PHP which called Perl which called SQL functions/procedures. Any other code-crawling tools would help in a similar fashion (i.e. javadoc, rdoc, perldoc, Doxygen etc.).
Reading any unit tests or specs can be quite enlightening.
Documenting things helps (either for yourself, or for other teammates, current and future). Read any existing documentation.
Of course, don't underestimate the power of simply asking a fellow teammate (or your boss!) questions. Early on, I asked as often as necessary "do we have a function/script/foo that does X?"
Go over the core libraries and read the function declarations. If it's C/C++, this means only the headers. Document whatever you don't understand.
The last time I did this, one of the comments I inserted was "This class is never used".
Do try to understand the code by fixing bugs in it. Do correct or maintain documentation. Don't modify comments in the code itself, that risks introducing new bugs.
In our line of work, generally speaking we do no changes to production code without good reason. This includes cosmetic changes; even these can introduce bugs.
No matter how disgusting a section of code seems, don't be tempted to rewrite it unless you have a bugfix or other change to do. If you spot a bug (or possible bug) when reading the code trying to learn it, record the bug for later triage, but don't attempt to fix it.
Another Procedure...
After reading Andy Hunt's "Pragmatic Thinking and Learning - Refactor Your Wetware" (which doesn't address this directly), I picked up a few tips that may be worth mentioning:
Observe Behavior:
If there's a UI, all the better. Use the app and get a mental map of relationships (e.g. links, modals, etc). Look at HTTP request if it helps, but don't put too much emphasis on it -- you just want a light, friendly acquaintance with app.
Acknowledge the Folder Structure:
Once again, this is light. Just see what belongs where, and hope that the structure is semantic enough -- you can always get some top-level information from here.
Analyze Call-Stacks, Top-Down:
Go through and list on paper or some other medium, but try not to type it -- this gets different parts of your brain engaged (build it out of Legos if you have to) -- function-calls, Objects, and variables that are closest to top-level first. Look at constants and modules, make sure you don't dive into fine-grained features if you can help it.
MindMap It!:
Maybe the most important step. Create a very rough draft mapping of your current understanding of the code. Make sure you run through the mindmap quickly. This allows an even spread of different parts of your brain to (mostly R-Mode) to have a say in the map.
Create clouds, boxes, etc. Wherever you initially think they should go on the paper. Feel free to denote boxes with syntactic symbols (e.g. 'F'-Function, 'f'-closure, 'C'-Constant, 'V'-Global Var, 'v'-low-level var, etc). Use arrows: Incoming array for arguments, Outgoing for returns, or what comes more naturally to you.
Start drawing connections to denote relationships. Its ok if it looks messy - this is a first draft.
Make a quick rough revision. Its its too hard to read, do another quick organization of it, but don't do more than one revision.
Open the Debugger:
Validate or invalidate any notions you had after the mapping. Track variables, arguments, returns, etc.
Track HTTP requests etc to get an idea of where the data is coming from. Look at the headers themselves but don't dive into the details of the request body.
MindMap Again!:
Now you should have a decent idea of most of the top-level functionality.
Create a new MindMap that has anything you missed in the first one. You can take more time with this one and even add some relatively small details -- but don't be afraid of what previous notions they may conflict with.
Compare this map with your last one and eliminate any question you had before, jot down new questions, and jot down conflicting perspectives.
Revise this map if its too hazy. Revise as much as you want, but keep revisions to a minimum.
Pretend Its Not Code:
If you can put it into mechanical terms, do so. The most important part of this is to come up with a metaphor for the app's behavior and/or smaller parts of the code. Think of ridiculous things, seriously. If it was an animal, a monster, a star, a robot. What kind would it be. If it was in Star Trek, what would they use it for. Think of many things to weigh it against.
Synthesis over Analysis:
Now you want to see not 'what' but 'how'. Any low-level parts that through you for a loop could be taken out and put into a sterile environment (you control its inputs). What sort of outputs are you getting. Is the system more complex than you originally thought? Simpler? Does it need improvements?
Contribute Something, Dude!:
Write a test, fix a bug, comment it, abstract it. You should have enough ability to start making minor contributions and FAILING IS OK :)! Note on any changes you made in commits, chat, email. If you did something dastardly, you guys can catch it before it goes to production -- if something is wrong, its a great way to get a teammate to clear things up for you. Usually listening to a teammate talk will clear a lot up that made your MindMaps clash.
In a nutshell, the most important thing to do is use a top-down fashion of getting as many different parts of your brain engaged as possible. It may even help to close your laptop and face your seat out the window if possible. Studies have shown that enforcing a deadline creates a "Pressure Hangover" for ~2.5 days after the deadline, which is why deadlines are often best to have on a Friday. So, BE RELAXED, THERE'S NO TIMECRUNCH, AND NOW PROVIDE YOURSELF WITH AN ENVIRONMENT THAT'S SAFE TO FAIL IN. Most of this can be fairly rushed through until you get down to details. Make sure that you don't bypass understanding of high-level topics.
Hope this helps you as well :)
All really good answers here. Just wanted to add few more things:
One can pair architectural understanding with flash cards and re-visiting those can solidify understanding. I find questions such as "Which part of code does X functionality ?", where X could be a useful functionality in your code base.
I also like to open a buffer in emacs and start re-writing some parts of the code base that I want to familiarize myself with and add my own comments etc.
One thing vi and emacs users can do is use tags. Tags are contained in a file ( usually called TAGS ). You generate one or more tags files by a command ( etags for emacs vtags for vi ). Then we you edit source code and you see a confusing function or variable you load the tags file and it will take you to where the function is declared ( not perfect by good enough ). I've actually written some macros that let you navigate source using Alt-cursor,
sort of like popd and pushd in many flavors of UNIX.
BubbaT
The first thing I do before going down into code is to use the application (as several different users, if necessary) to understand all the functionalities and see how they connect (how information flows inside the application).
After that I examine the framework in which the application was built, so that I can make a direct relationship between all the interfaces I have just seen with some View or UI code.
Then I look at the database and any database commands handling layer (if applicable), to understand how that information (which users manipulate) is stored and how it goes to and comes from the application
Finally, after learning where data comes from and how it is displayed I look at the business logic layer to see how data gets transformed.
I believe every application architecture can de divided like this and knowning the overall function (a who is who in your application) might be beneficial before really debugging it or adding new stuff - that is, if you have enough time to do so.
And yes, it also helps a lot to talk with someone who developed the current version of the software. However, if he/she is going to leave the company soon, keep a note on his/her wish list (what they wanted to do for the project but were unable to because of budget contraints).
create documentation for each thing you figured out from the codebase.
find out how it works by exprimentation - changing a few lines here and there and see what happens.
use geany as it speeds up the searching of commonly used variables and functions in the program and adds it to autocomplete.
find out if you can contact the orignal developers of the code base, through facebook or through googling for them.
find out the original purpose of the code and see if the code still fits that purpose or should be rewritten from scratch, in fulfillment of the intended purpose.
find out what frameworks did the code use, what editors did they use to produce the code.
the easiest way to deduce how a code works is by actually replicating how a certain part would have been done by you and rechecking the code if there is such a part.
it's reverse engineering - figuring out something by just trying to reengineer the solution.
most computer programmers have experience in coding, and there are certain patterns that you could look up if that's present in the code.
there are two types of code, object oriented and structurally oriented.
if you know how to do both, you're good to go, but if you aren't familiar with one or the other, you'd have to relearn how to program in that fashion to understand why it was coded that way.
in objected oriented code, you can easily create diagrams documenting the behaviors and methods of each object class.
if it's structurally oriented, meaning by function, create a functions list documenting what each function does and where it appears in the code..
i haven't done either of the above myself, as i'm a web developer it is relatively easy to figure out starting from index.php to the rest of the other pages how something works.
goodluck.

Best use pattern for a DataContext

What's the best lifetime model for a DataContext? Should I just create a new one whenever I need it (aka, function level), should I keep one available in each class that would use it (class level), or should I create a static class with a static DataContext (app-domain level)? Are there any considered best practices on this?
You pretty much need to keep the same data context available throughout the lifetime of the operations you want to perform if you're ever going to be storing changes which are to be .SubmitChanges()'d later, as otherwise you will lose those changes.
If you're just querying stuff then it's fine to create them as needed, but then if later you want to .SubmitChanges() you'll have to refactor your code a lot, so you may as well adopt the pattern of effectively keeping the datacontext global throughout your app from the beginning.
Note the data context is disconnected. The connection is only made when the query data is enumerated (not when you first run the query, it's a 'lazy' data type so only provides data when it's needed), and then closed immediately afterwards. On .SubmitChanges() the connection is opened to submit the changes then closed immediately afterwards. So don't think keeping the datacontext around keeps a connection open, it doesn't (you can hook the StateChange event of the connection to confirm this for yourself, that's how I'm sure).
There is a great article over at Rick Strahl's Blog which covers this topic in depth, far more than my answer here provides!!
I think Jeff Atwood talked about this in the Herding Code podcast, when he was questioned about the exact same thing. Listen to it towards the last 15-20 minutes or so.
I think in SO, the datacontext is created in the Controller class. Not sure about a lot of details here. But that's what it looked like.