Many people have argued about function size. They say that functions in general should be pretty short. Opinions vary from something like 15 lines to "about one screen", which today is probably about 40-80 lines.
Also, functions should always fulfill one task only.
However, there is one kind of function that frequently fails in both criteria in my code: Initialization functions.
For example in an audio application, the audio hardware/API has to be set up, audio data has to be converted to a suitable format and the object state has to properly initialized. These are clearly three different tasks and depending on the API this can easily span more than 50 lines.
The thing with init-functions is that they are generally only called once, so there is no need to re-use any of the components. Would you still break them up into several smaller functions would you consider big initialization functions to be ok?
I would still break the function up by task, and then call each of the lower level functions from within my public-facing initialize function:
void _init_hardware() { }
void _convert_format() { }
void _setup_state() { }
void initialize_audio() {
_init_hardware();
_convert_format();
_setup_state();
}
Writing succinct functions is as much about isolating fault and change as keeping things readable. If you know the failure is in _convert_format(), you can track down the ~40 lines responsible for a bug quite a bit faster. The same thing applies if you commit changes that only touch one function.
A final point, I make use of assert() quite frequently so I can "fail often and fail early", and the beginning of a function is the best place for a couple of sanity-checking asserts. Keeping the function short allows you to test the function more thoroughly based on its more narrow set of duties. It's very hard to unit-test a 400 line function that does 10 different things.
If breaking into smaller parts makes code better structured and/or more readable - do it no matter what the function does. It not about the number of lines it's about code quality.
I would still try to break up the functions into logical units. They should be as long or as short as makes sense. For example:
SetupAudioHardware();
ConvertAudioData();
SetupState();
Assigning them clear names makes everything more intuitive and readable. Also, breaking them apart makes it easier for future changes and/or other programs to reuse them.
In a situation like this I think it comes down to a matter of personal preference. I prefer to have functions do only one thing so I would split the initialization into separate functions, even if they are only called once. However, if someone wanted to do it all in a single function I wouldn't worry about it too much (as long as the code was clear). There are more important things to argue about (like whether curly braces belong on their own separate line).
If you have a lot of components the need to be plugged into each other, it can certainly be reasonably natural to have a large method - even if the creation of each component is refactored into a separate method where feasible.
One alternative to this is to use a Dependency Injection framework (e.g. Spring, Castle Windsor, Guice etc). That has definite pros and cons... while working your way through one big method can be quite painful, you at least have a good idea of where everything is initialized, and there's no need to worry about what "magic" might be going on. Then again, the initialization can't be changed after deployment (as it can with an XML file for Spring, for example).
I think it makes sense to design the main body of your code so that it can be injected - but whether that injection is via a framework or just a hard-coded (and potentially long) list of initialization calls is a choice which may well change for different projects. In both cases the results are hard to test other than by just running the application.
First, a factory should be used instead of an initialization function. That is, rather than have initialize_audio(), you have a new AudioObjectFactory (you can think of a better name here). This maintains separation of concerns.
However, be careful also not to abstract too early. Clearly you do have two concerns already: 1) audio initialization and 2) using that audio. Until, for example, you abstract the audio device to be initialized, or the way a given device may be configured during initialization, your factory method (audioObjectFactory.Create() or whatever), should really be kept to just one big method. Early abstraction serves only to obfuscate design.
Note that audioObjectFactory.Create() is not something that can be unit-tested. Testing it is an integration test, and until there are parts of it that can be abstracted, it will remain an integration test. Later on, you may find that the you have multiple different factories for different configurations; at that point, it might be beneficial to abstract the hardware calls into an interface, so you that you can create unit tests to ensure the various factories configure the hardware in a proper way.
I think it's the wrong approach to try and count the number of lines and determine functions based on that. For something like initialization code I often have a separate function for it, but mostly so that the Load or Init or New functions aren't cluttered and confusing. If you can separate it into a few tasks like others have suggested, then you can name it something useful and help organize. Even if you are calling it just once, it's not a bad habit, and often you find that there are other times when you may want to re-init things and can use that function again.
Just thought I'd throw this out there, since it hasn't been mentioned yet - the Facade Pattern is sometimes cited as an interface to a complex subsystem. I haven't done much with it myself, but the metaphors are usually something like turning on a computer (requires several steps), or turning on a home theater system (turn on TV, turn on receiver, turn down lights, etc...)
Depending on the code structure, might be something worth considering to abstract away your large initialization functions. I still agree with meagar's point though that breaking down functions into _init_X(), _init_Y(), etc. is a good way to go. Even if you aren't going to reuse comments in this code, on your next project, when you say to yourself, "How did I initialize that X-component?", it'll be much easier to go back and pick it out of the smaller _init_X() function than it would be to pick it out of a larger function, especially if the X-initialization is scattered throughout it.
Function length is, as you tagged, a very subjective matter. However, a standard best-practice is to isolate code that is often repeated and/or can function as its own entity. For instance, if your initialization function is loading library files or objects that will be used by a specific library, that block of code should be modularized.
With that said, it's not bad to have an initialization method that's long, as long as it's not long because of lots of repeated code or other snippets that can be abstracted away.
Hope that helps,
Carlos Nunez
Related
I have been reading over design-by-contract posts and examples, and there is something that I cannot seem to wrap my head around. In all of the examples I have seen, DbC is used on a trivial class testing its own state in the post-conditions (e.g. lots of Bank Accounts).
It seems to me that most of the time when you call a method of a class, it does much more work delegating method calls to its external dependencies. I understand how to check for this in a Unit-Test with specific scenarios using dependency inversion and mock objects that focus on the external behavior of the method, but how does this work with DbC and post-conditions?
My second question has to deal with understanding complex post-conditions. It seems to me that to write out a post-condition for many functions, that you basically have to re-write the body of the function for your post-condition to know what the new state is going to be. What is the point of that?
I really do like the notion of DbC and I think that it has great promise, particularly if I can figure out how to reproduce some failure state once I find a validated contract. Over the past couple of hours I have been reading some neat stuff wrt. automatic test generation in Eiffel. I am currently trying to improve my processes in C++ development, but I am open to learning something new if I can figure out how to not lose all of the ground I have made in my current projects. Thanks.
but how does this work with DbC and
post-conditions?
Every function is basically one of these:
A sequence of statements
A conditional statement
A loop
The idea is that you should check any postconditions about the results of the function that go beyond the union of the postconditions of all the functions called.
that you basically have to re-write
the body of the function for your
post-condition to know what the new
state is going to be
Think about it the other way round. What made you write the function in the first place? What were you pursuing? Can that be expressed in a postcondition which is more simple than the function body itself? A postcondition will typically use queries (what in C++ are const functions), while the body usually combines commands and queries (methods that modify the object and methods which only get information from it).
In some cases, yes, you will find out that you can really add little value with postconditions. In these cases, writing a bunch of tests will typically be enough.
See also:
Bertrand Meyer, Contract Driven
Development
Related questions 1, 2
Delegation at the contract level
most of the time when you call a
method of a class, it does much more
work delegating method calls to its
external dependencies
As for this first question: the implementation of a function/method may call many other function/methods, but if the designer of the code had a clear mind, this does not imply that the specification of the caller is the concatenation of the specifications of the callees. For a method that calls many others, the size of the specification can remain contained if the method accomplishes a precise and well-defined task. Which it should if the whole system was well designed.
You are clearly asking your question from the point of view of run-time assertion checking. In this context, the above would perhaps be expressed as "you don't need to re-check in the post-condition of the caller that all the callees have respected their respective contracts. These checks will already be made on each call. In the post-condition of the caller, only check the functionally visible result of the caller."
Understanding complex post-conditions
You may find this "ACSL by example" document interesting (although probably different from what you're used to). It contains many examples of formal contracts for C functions. The language of the contracts is intended for static verification instead of run-time checking, with all the advantages and the drawbacks that it entails. They are a little more sophisticated than the "Bank Accounts" that you mention — these functions implement real algorithms, although simple ones. The document keeps the contracts short and readable by introducing well-thought-out auxiliary predicates (which would be called queries in Eiffel, as Daniel points out in his answer).
I often hear around here from test driven development people that having a function get large amounts of information implicitly is a bad thing. I can see were this would be bad from a testing perspective, but isn't it sometimes necessary from an encapsulation perspective? The following question comes to mind:
Is using Random and OrderBy a good shuffle algorithm?
Basically, someone wanted to create a function in C# to randomly shuffle an array. Several people told him that the random number generator should be passed in as a parameter. This seems like an egregious violation of encapsulation to me, even if it does make testing easier. Isn't the fact that an array shuffling algorithm requires any state at all other than the array it's shuffling an implementation detail that the caller should not have to care about? Wouldn't the correct place to get this information be implicitly, possibly from a thread-local singleton?
I don't think it breaks encapsulation. The only state in the array is the data itself - and "a source of randomness" is essentially a service. Why should an array naturally have an associated source of randomness? Why should that have to be a singleton? What about different situations which have different requirements - e.g. speed vs cryptographically secure randomness? There's a reason why java.util.Random has a SecureRandom subclass :) Perhaps it doesn't matter whether the shuffle's results are predictable with a lot of effort and observation - or perhaps it does. That will depend on the context, and that's information that the shuffle algorithm shouldn't care about.
Once you start thinking of it as a service, it makes sense that it's passed in as a dependency.
Yes, you could get it from a thread-local singleton (and indeed I'm going to blog about exactly that in the next few days) but I would generally code it so that the caller gets to make that decision.
One benefit of the "randomness as a service" concept is that it makes for repeatability - if you've got a test which fails, you can pass in a Random with a specific seed and know you'll always get the same results, which makes debugging easier.
Of course, there's always the option of making the Random optional - use a thread-local singleton as a default if the caller doesn't provide their own.
Yes, that does break encapsulation. As with most software design decisions, this is a trade-off between two opposing forces. If you encapsulate the RNG then you make it difficult to change for a unit test. If you make it a parameter then you make it easy for a user to change the RNG (and potentially get it wrong).
My personal preference is to make it easy to test, then provide a default implementation (a default constructor that creates its own RNG, in this particular case) and good documentation for the end user. Adding a method with the signature
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
that creates a Random using the current system time as its seed would take care of most normal use cases of this method. The original method
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source, Random rng)
could be used for testing (pass in a Random object with a known seed) and also in those rare cases where a user decides they need a cryptographically secure RNG. The one-parameter implementation should call this method.
I don't think this violates encapsulation.
Your Example
I would say that being able to provide an RNG is a feature of the class. I would obviously provide a method that doesn't require it, but I can see times where it may be useful to be able to duplicate the randomization.
What if the array shuffler was part of a game that used the RNG for level generation. If a user wanted to save the level and play it again later, it may be more efficient to store the RNG seed.
General Case
Simple classes that have a single task like this typically don't need to worry about divulging their inner workings. What they encapsulate is the logic of the task, not the elements required by that logic.
This has been asked before (question no. 308581), but that particular question and the answers are a bit C++ specific and a lot of things there are not really relevant in languages like Java or C#.
The thing is, that even after refactorization, I find that there is a bit of mess in my source code files. I mean, the function bodies are alright, but I'm not quite happy with the way the functions themselves are ordered. Of course, in an IDE like Visual Studio it is relatively easy to find a member if you remember how it is called, but this is not always the case.
I've tried a couple of approaches like putting public methods first but that the drawback of this approach is that a function at the top of the file ends up calling an other private function at the bottom of the file so I end up scrolling all the time.
Another approach is to try to group related methods together (maybe into regions) but obviously this has its limits as if there are many non-related methods in the same class then maybe it's time to break up the class to two or more smaller classes.
So consider this: your code has been refactored properly so that it satisfies all the requirements mentioned in Code Complete, but you would still like to reorder your methods for ergonomic purposes. What's your approach?
(Actually, while not exactly a technical problem, this is problem really annoys the hell out of me so I would be really grateful if someone could come up with a good approach)
Actually I totally rely on the navigation functionality of my IDE, i.e. Visual Studio. Most of the time I use F12 to jump to the declaration (or Shift-F12 to find all references) and the Ctrl+- to jump back.
The reason for that is that most of the time I am working on code that I haven't written myself and I don't want to spend my time re-ordering methods and fields.
P.S.: And I also use RockScroll, a VS add-in which makes navigating and scrolling large files quite easy
If you're really having problems scrolling and finding, it's possible you're suffering from god class syndrome.
Fwiw, I personally tend to go with:
class
{
#statics (if any)
#constructor
#destructor (if any)
#member variables
#properties (if any)
#public methods (overrides, etc, first then extensions)
#private (aka helper) methods (if any)
}
And I have no aversion to region blocks, nor comments, so make free use of both to denote relationships.
From my (Java) point of view I would say constructors, public methods, private methods, in that order. I always try to group methods implementing a certain interface together.
My favorite weapon of choice is IntelliJ IDEA, which has some nice possibilities to fold methods bodies so it is quite easy to display two methods directly above each other even when their actual position in the source file is 700 lines apart.
I would be careful with monkeying around with the position of methods in the actual source. Your IDE should give you the ability to view the source in the way you want. This is especially relevant when working on a project where developers can use their IDE of choice.
My order, here it comes.
I usually put statics first.
Next come member variables and properties, a property that accesses one specific member is grouped together with this member. I try to group related information together, for example all strings that contain path information.
Third is the constructor (or constructors if you have several).
After that follow the methods. Those are ordered by whatever appears logical for that specific class. I often group methods by their access level: private, protected, public. But I recently had a class that needed to override a lot of methods from its base class. Since I was doing a lot of work there, I put them together in one group, regardless of their access level.
My recommendation: Order your classes so that it helps your workflow. Do not simply order them, just to have order. The time spent on ordering should be an investment that helps you save more time that you would otherwise need to scroll up and down.
In C# I use #region to seperate those groups from each other, but that is a matter of taste. There are a lot of people who don't like regions. I do.
I place the most recent method I just created on top of the class. That way when I open the project, I'm back at the last method I'm developing. Easier for me to get back "in the zone."
It also reflected the fact that the method(which uses other methods) I just created is the topmost layer of other methods.
Group related functions together, don't be hard-pressed to put all private functions at the bottom. Likewise, imitate the design rationale of C#'s properties, related functions should be in close proximity to each other, the C# language construct for properties reinforces that idea.
P.S.
If only C# can nest functions like Pascal or Delphi. Maybe Anders Hejlsberg can put it in C#, he also invented Turbo Pascal and Delphi :-) D language has nested functions.
A few years ago I spent far too much time pondering this question, and came up with a horrendously complex system for ordering the declarations within a class. The order would depend on the access specifier, whether a method or field was static, transient, volatile etc.
It wasn't worth it. IMHO you get no real benefit from such a complex arrangement.
What I do nowadays is much simpler:
Constructors (default constructor first, otherwise order doesn't matter.)
Methods, sorted by name (static vs. non-static doesn't matter, nor abstract vs. concrete, virtual vs. final etc.)
Inner classes, sorted by name (interface vs. class etc. doesn't matter)
Fields, sorted by name (static vs. non-static doesn't matter.) Optionally constants (public static final) first, but this is not essential.
i pretty sure there was a visual studio addin that could re-order the class members in the code.
so i.e. ctors on the top of the class then static methods then instance methods...
something like that
unfortunately i can't remember the name of this addin! i also think that this addin was for free!
maybe someone other can help us out?
My personal take for structuring a class is as follows:
I'm strict with
constants and static fields first, in alpha order
non-private inner classes and enums in alpha order
fields (and attributes where applicable), in alpha order
ctors (and dtors where applicable)
static methods and factory methods
methods below, in alpha order, regardless of visibility.
I use the auto-formatting capabilities of an IDE at all times. So I'm constantly hitting Ctrl+Shift+F when I'm working. I export auto-formatting capabilities in an xml file which I carry with me everywhere.
It helps down the lane when doing merges and rebases. And it is the type of thing you can automate in your IDE or build process so that you don't have to make a brain cell sweat for it.
I'm not claiming MY WAY is the way. But pick something, configure it, use it consistently until it becomes a reflex, and thus forget about it.
What factors determine which approach is more appropriate?
I think both have their places.
You shouldn't simply use DoSomethingToThing(Thing n) just because you think "Functional programming is good". Likewise you shouldn't simply use Thing.DoSomething() because "Object Oriented programming is good".
I think it comes down to what you are trying to convey. Stop thinking about your code as a series of instructions, and start thinking about it like a paragraph or sentence of a story. Think about which parts are the most important from the point of view of the task at hand.
For example, if the part of the 'sentence' you would like to stress is the object, you should use the OO style.
Example:
fileHandle.close();
Most of the time when you're passing around file handles, the main thing you are thinking about is keeping track of the file it represents.
CounterExample:
string x = "Hello World";
submitHttpRequest( x );
In this case submitting the HTTP request is far more important than the string which is the body, so submitHttpRequst(x) is preferable to x.submitViaHttp()
Needless to say, these are not mutually exclusive. You'll probably actually have
networkConnection.submitHttpRequest(x)
in which you mix them both. The important thing is that you think about what parts are emphasized, and what you will be conveying to the future reader of the code.
To be object-oriented, tell, don't ask : http://www.pragmaticprogrammer.com/articles/tell-dont-ask.
So, Thing.DoSomething() rather than DoSomethingToThing(Thing n).
If you're dealing with internal state of a thing, Thing.DoSomething() makes more sense, because even if you change the internal representation of Thing, or how it works, the code talking to it doesn't have to change. If you're dealing with a collection of Things, or writing some utility methods, procedural-style DoSomethingToThing() might make more sense or be more straight-forward; but still, can usually be represented as a method on the object representing that collection: for instance
GetTotalPriceofThings();
vs
Cart.getTotal();
It really depends on how object oriented your code is.
Thing.DoSomething is appropriate if Thing is the subject of your sentence.
DoSomethingToThing(Thing n) is appropriate if Thing is the object of your sentence.
ThingA.DoSomethingToThingB(ThingB m) is an unavoidable combination, since in all the languages I can think of, functions belong to one class and are not mutually owned. But this makes sense because you can have a subject and an object.
Active voice is more straightforward than passive voice, so make sure your sentence has a subject that isn't just "the computer". This means, use form 1 and form 3 frequently, and use form 2 rarely.
For clarity:
// Form 1: "File handle, close."
fileHandle.close();
// Form 2: "(Computer,) close the file handle."
close(fileHandle);
// Form 3: "File handle, write the contents of another file handle."
fileHandle.writeContentsOf(anotherFileHandle);
I agree with Orion, but I'm going to rephrase the decision process.
You have a noun and a verb / an object and an action.
If many objects of this type will use this action, try to make the action part of the object.
Otherwise, try to group the action separately, but with related actions.
I like the File / string examples. There are many string operations, such as "SendAsHTTPReply", which won't happen for your average string, but do happen often in a certain setting. However, you basically will always close a File (hopefully), so it makes perfect sense to put the Close action in the class interface.
Another way to think of this is as buying part of an entertainment system. It makes sense to bundle a TV remote with a TV, because you always use them together. But it would be strange to bundle a power cable for a specific VCR with a TV, since many customers will never use this. The key idea is how often will this action be used on this object?
Not nearly enough information here. It depends if your language even supports the construct "Thing.something" or equivalent (ie. it's an OO language). If so, it's far more appropriate because that's the OO paradigm (members should be associated with the object they act on). In a procedural style, of course, DoSomethingtoThing() is your only choice... or ThingDoSomething()
DoSomethingToThing(Thing n) would be more of a functional approach whereas Thing.DoSomething() would be more of an object oriented approach.
That is the Object Oriented versus Procedural Programming choice :)
I think the well documented OO advantages apply to the Thing.DoSomething()
This has been asked Design question: does the Phone dial the PhoneNumber, or does the PhoneNumber dial itself on the Phone?
Here are a couple of factors to consider:
Can you modify or extend the Thing class. If not, use the former
Can Thing be instantiated. If not, use the later as a static method
If Thing actually get modified (i.e. has properties that change), prefer the latter. If Thing is not modified the latter is just as acceptable.
Otherwise, as objects are meant to map on to real world object, choose the method that seems more grounded in reality.
Even if you aren't working in an OO language, where you would have Thing.DoSomething(), for the overall readability of your code, having a set of functions like:
ThingDoSomething()
ThingDoAnotherTask()
ThingWeDoSomethingElse()
then
AnotherThingDoSomething()
and so on is far better.
All the code that works on "Thing" is on the one location. Of course, the "DoSomething" and other tasks should be named consistently - so you have a ThingOneRead(), a ThingTwoRead()... by now you should get point. When you go back to work on the code in twelve months time, you will appreciate taking the time to make things logical.
In general, if "something" is an action that "thing" naturally knows how to do, then you should use thing.doSomething(). That's good OO encapsulation, because otherwise DoSomethingToThing(thing) would have to access potential internal information of "thing".
For example invoice.getTotal()
If "something" is not naturally part of "thing's" domain model, then one option is to use a helper method.
For example: Logger.log(invoice)
If DoingSomething to an object is likely to produce a different result in another scenario, then i'd suggest you oneThing.DoSomethingToThing(anotherThing).
For example you may have two was of saving thing in you program so you might adopt a DatabaseObject.Save(thing) SessionObject.Save(thing) would be more advantageous than thing.Save() or thing.SaveToDatabase or thing.SaveToSession().
I rarely pass no parameters to a class, unless I'm retrieving public properties.
To add to Aeon's answer, it depends on the the thing and what you want to do to it. So if you are writing Thing, and DoSomething alters the internal state of Thing, then the best approach is Thing.DoSomething. However, if the action does more than change the internal state, then DoSomething(Thing) makes more sense. For example:
Collection.Add(Thing)
is better than
Thing.AddSelfToCollection(Collection)
And if you didn't write Thing, and cannot create a derived class, then you have no chocie but to do DoSomething(Thing)
Even in object oriented programming it might be useful to use a function call instead of a method (or for that matter calling a method of an object other than the one we call it on). Imagine a simple database persistence framework where you'd like to just call save() on an object. Instead of including an SQL statement in every class you'd like to have saved, thus complicating code, spreading SQL all across the code and making changing the storage engine a PITA, you could create an Interface defining save(Class1), save(Class2) etc. and its implementation. Then you'd actually be calling databaseSaver.save(class1) and have everything in one place.
I have to agree with Kevin Conner
Also keep in mind the caller of either of the 2 forms. The caller is probably a method of some other object that definitely does something to your Thing :)
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
Singletons are a hotly debated design pattern, so I am interested in what the Stack Overflow community thought about them.
Please provide reasons for your opinions, not just "Singletons are for lazy programmers!"
Here is a fairly good article on the issue, although it is against the use of Singletons:
scientificninja.com: performant-singletons.
Does anyone have any other good articles on them? Maybe in support of Singletons?
In defense of singletons:
They are not as bad as globals because globals have no standard-enforced initialization order, and you could easily see nondeterministic bugs due to naive or unexpected dependency orders. Singletons (assuming they're allocated on the heap) are created after all globals, and in a very predictable place in the code.
They're very useful for resource-lazy / -caching systems such as an interface to a slow I/O device. If you intelligently build a singleton interface to a slow device, and no one ever calls it, you won't waste any time. If another piece of code calls it from multiple places, your singleton can optimize caching for both simultaneously, and avoid any double look-ups. You can also easily avoid any deadlock condition on the singleton-controlled resource.
Against singletons:
In C++, there's no nice way to auto-clean-up after singletons. There are work-arounds, and slightly hacky ways to do it, but there's just no simple, universal way to make sure your singleton's destructor is always called. This isn't so terrible memory-wise -- just think of it as more global variables, for this purpose. But it can be bad if your singleton allocates other resources (e.g. locks some files) and doesn't release them.
My own opinion:
I use singletons, but avoid them if there's a reasonable alternative. This has worked well for me so far, and I have found them to be testable, although slightly more work to test.
Google has a Singleton Detector for Java that I believe started out as a tool that must be run on all code produced at Google. The nutshell reason to remove Singletons:
because they can make testing
difficult and hide problems with your
design
For a more explicit explanation see 'Why Singletons Are Controversial' from Google.
A singleton is just a bunch of global variables in a fancy dress.
Global variables have their uses, as do singletons, but if you think you're doing something cool and useful with a singleton instead of using a yucky global variable (everyone knows globals are bad mmkay), you're unfortunately misled.
The purpose of a Singleton is to ensure a class has only one instance, and provide a global point of access to it. Most of the time the focus is on the single instance point. Imagine if it were called a Globalton. It would sound less appealing as this emphasizes the (usually) negative connotations of a global variable.
Most of the good arguments against singletons have to do with the difficulty they present in testing as creating test doubles for them is not easy.
There's three pretty good blog posts about Singletons by Miško Hevery in the Google Testing blog.
Singletons are Pathological Liars
Where Have All the Singletons Gone?
Root Cause of Singletons
Singleton is not a horrible pattern, although it is misused a lot. I think this misuse is because it is one of the easier patterns and most new to the singleton are attracted to the global side effect.
Erich Gamma had said the singleton is a pattern he wishes wasn't included in the GOF book and it's a bad design. I tend to disagree.
If the pattern is used in order to create a single instance of an object at any given time then the pattern is being used correctly. If the singleton is used in order to give a global effect, it is being used incorrectly.
Disadvantages:
You are coupling to one class throughout the code that calls the singleton
Creates a hassle with unit testing because it is difficult to replace the instance with a mock object
If the code needs to be refactored later on because of the need for more than one instance, it is more painful than if the singleton class were passed into the object (using an interface) that uses it
Advantages:
One instance of a class is represented at any given point in time.
By design you are enforcing this
Instance is created when it is needed
Global access is a side effect
Chicks dig me because I rarely use singleton and when I do it's typically something unusual. No, seriously, I love the singleton pattern. You know why? Because:
I'm lazy.
Nothing can go wrong.
Sure, the "experts" will throw around a bunch of talk about "unit testing" and "dependency injection" but that's all a load of dingo's kidneys. You say the singleton is hard to unit test? No problem! Just declare everything public and turn your class into a fun house of global goodness. You remember the show Highlander from the 1990's? The singleton is kind of like that because: A. It can never die; and B. There can be only one. So stop listening to all those DI weenies and implement your singleton with abandon. Here are some more good reasons...
Everybody is doing it.
The singleton pattern makes you invincible.
Singleton rhymes with "win" (or "fun" depending on your accent).
I think there is a great misunderstanding about the use of the Singleton pattern. Most of the comments here refer to it as a place to access global data. We need to be careful here - Singleton as a pattern is not for accessing globals.
Singleton should be used to have only one instance of the given class. Pattern Repository has great information on Singleton.
One of the colleagues I have worked with was very Singleton-minded. Whenever there was something that was kind of a manager or boss like object he would make that into a singleton, because he figured that there should be only one boss. And each time the system took up some new requirements, it turned out there were perfectly valid reasons to allow multiple instances.
I would say that singleton should be used if the domain model dictates (not 'suggests') that there is one. All other cases are just accendentally single instances of a class.
I've been trying to think of a way to come to the poor singelton's rescue here, but I must admit it's hard. I've seen very few legitimate uses of them and with the current drive to do dependency injection andd unit testing they are just hard to use. They definetly are the "cargo cult" manifestation of programming with design patterns I have worked with many programmers that have never cracked the "GoF" book but they know 'Singelton' and thus they know 'Patterns'.
I do have to disagree with Orion though, most of the time I've seen singeltons oversused it's not global variables in a dress, but more like global services(methods) in a dress. It's interesting to note that if you try to use Singeltons in the SQL Server 2005 in safe mode through the CLR interface the system will flag the code. The problem is that you have persistent data beyond any given transaction that may run, of course if you make the instance variable read only you can get around the issue.
That issue lead to a lot of rework for me one year.
Holy wars! Ok let me see.. Last time I checked the design police said..
Singletons are bad because they hinder auto testing - instances cannot be created afresh for each test case.
Instead the logic should be in a class (A) that can be easily instantiated and tested. Another class (B) should be responsible for constraining creation. Single Responsibility Principle to the fore! It should be team-knowledge that you're supposed to go via B to access A - sort of a team convention.
I concur mostly..
Many applications require that there is only one instance of some class, so the pattern of having only one instance of a class is useful. But there are variations to how the pattern is implemented.
There is the static singleton, in which the class forces that there can only be one instance of the class per process (in Java actually one per ClassLoader). Another option is to create only one instance.
Static singletons are evil - one sort of global variables. They make testing harder, because it's not possible to execute the tests in full isolation. You need complicated setup and tear down code for cleaning the system between every test, and it's very easy to forget to clean some global state properly, which in turn may result in unspecified behaviour in tests.
Creating only one instance is good. You just create one instance when the programs starts, and then pass the pointer to that instance to all other objects which need it. Dependency injection frameworks make this easy - you just configure the scope of the object, and the DI framework will take care of creating the instance and passing it to all who need it. For example in Guice you would annotate the class with #Singleton, and the DI framework will create only one instance of the class (per application - you can have multiple applications running in the same JVM). This makes testing easy, because you can create a new instance of the class for each test, and let the garbage collector destroy the instance when it is no more used. No global state will leak from one test to another.
For more information:
The Clean Code Talks - "Global State and Singletons"
Singleton as an implementation detail is fine. Singleton as an interface or as an access mechanism is a giant PITA.
A static method that takes no parameters returning an instance of an object is only slightly different from just using a global variable. If instead an object has a reference to the singleton object passed in, either via constructor or other method, then it doesn't matter how the singleton is actually created and the whole pattern turns out not to matter.
It was not just a bunch of variables in a fancy dress because this was had dozens of responsibilities, like communicating with persistence layer to save/retrieve data about the company, deal with employees and prices collections, etc.
I must say you're not really describing somthing that should be a single object and it's debatable that any of them, other than Data Serialization should have been a singelton.
I can see at least 3 sets of classes that I would normally design in, but I tend to favor smaller simpler objects that do a narrow set of tasks very well. I know that this is not the nature of most programmers. (Yes I work on 5000 line class monstrosities every day, and I have a special love for the 1200 line methods some people write.)
I think the point is that in most cases you don't need a singelton and often your just making your life harder.
The biggest problem with singletons is that they make unit testing hard, particularly when you want to run your tests in parallel but independently.
The second is that people often believe that lazy initialisation with double-checked locking is a good way to implement them.
Finally, unless your singletons are immutable, then they can easily become a performance problem when you try and scale your application up to run in multiple threads on multiple processors. Contended synchronization is expensive in most environments.
Singletons have their uses, but one must be careful in using and exposing them, because they are way too easy to abuse, difficult to truly unit test, and it is easy to create circular dependencies based on two singletons that accesses each other.
It is helpful however, for when you want to be sure that all your data is synchronized across multiple instances, e.g., configurations for a distributed application, for instance, may rely on singletons to make sure that all connections use the same up-to-date set of data.
I find you have to be very careful about why you're deciding to use a singleton. As others have mentioned, it's essentially the same issue as using global variables. You must be very cautious and consider what you could be doing by using one.
It's very rare to use them and usually there is a better way to do things. I've run into situations where I've done something with a singleton and then had to sift through my code to take it out after I discovered how much worse it made things (or after I came up with a much better, more sane solution)
I've used singletons a bunch of times in conjunction with Spring and didn't consider it a crutch or lazy.
What this pattern allowed me to do was create a single class for a bunch of configuration-type values and then share the single (non-mutable) instance of that specific configuration instance between several users of my web application.
In my case, the singleton contained client configuration criteria - css file location, db connection criteria, feature sets, etc. - specific for that client. These classes were instantiated and accessed through Spring and shared by users with the same configuration (i.e. 2 users from the same company). * **I know there's a name for this type of application but it's escaping me*
I feel it would've been wasteful to create (then garbage collect) new instances of these "constant" objects for each user of the app.
I'm reading a lot about "Singleton", its problems, when to use it, etc., and these are my conclusions until now:
Confusion between the classic implementation of Singleton and the real requirement: TO HAVE JUST ONE INSTANCE OF a CLASS!
It's generally bad implemented. If you want a unique instance, don't use the (anti)pattern of using a static GetInstance() method returning a static object. This makes a class to be responsible for instantiating a single instance of itself and also perform logic. This breaks the Single Responsibility Principle. Instead, this should be implemented by a factory class with the responsibility of ensuring that only one instance exists.
It's used in constructors, because it's easy to use and must not be passed as a parameter. This should be resolved using dependency injection, that is a great pattern to achieve a good and testable object model.
Not TDD. If you do TDD, dependencies are extracted from the implementation because you want your tests to be easy to write. This makes your object model to be better. If you use TDD, you won't write a static GetInstance =). BTW, if you think in objects with clear responsibilities instead classes, you'll get the same effect =).
I really disagree on the bunch of global variables in a fancy dress idea. Singletons are really useful when used to solve the right problem. Let me give you a real example.
I once developed a small piece of software to a place I worked, and some forms had to use some info about the company, its employees, services and prices. At its first version, the system kept loading that data from the database every time a form was opened. Of course, I soon realized this approach was not the best one.
Then I created a singleton class, named company, which encapsulated everything about the place, and it was completely filled with data by the time the system was opened.
It was not just a bunch of variables in a fancy dress because this was had dozens of responsibilities, like communicating with persistence layer to save/retrieve data about the company, deal with employees and prices collections, etc.
Plus, it was a fixed, system-wide, easily accessible point to have the company data.
Singletons are very useful, and using them is not in and of itself an anti-pattern. However, they've gotten a bad reputation largely because they force any consuming code to acknowledge that they are a singleton in order to interact with them. That means if you ever need to "un-Singletonize" them, the impact on your codebase can be very significant.
Instead, I'd suggest either hiding the Singleton behind a factory. That way, if you need to alter the service's instantiation behavior in the future, you can just change the factory rather than all types that consume the Singleton.
Even better, use an inversion of control container! Most of them allow you to separate instantiation behavior from the implementation of your classes.
One scary thing on singletons in for instance Java is that you can end up with multiple instances of the same singleton in some cases. The JVM uniquely identifies based on two elements: A class' fully qualified name, and the classloader responsible for loading it.
That means the same class can be loaded by two classloaders unaware of each other, and different parts of your application would have different instances of this singleton that they interact with.
Write normal, testable, injectable objects and let Guice/Spring/whatever handle the instantiation. Seriously.
This applies even in the case of caches or whatever the natural use cases for singletons are. There's no need to repeat the horror of writing code to try to enforce one instance. Let your dependency injection framework handle it. (I recommend Guice for a lightweight DI container if you're not already using one).