I have noticed in a lot of code lately that people put hard-coded configuration values (port numbers and the like) deep inside classes and methods, which makes them difficult to find and impossible to configure.
Is this a violation of the SOLID principles? If not, is there another "principle" that I can cite to my team members about why it's not a good idea? I don't want to just say "it's bad because I don't like it" but I am having trouble thinking of a good argument.
A good argument against hardcoding a TCP port number in a class is that it violates 'context independence'. From GOOS, with my emphasis:
Context Independence
... the "context independence" rule helps us decide whether an object hides too much or hides the wrong information. A system is easier to change if its objects are context-independent; that is, if each object has no built-in knowledge about the system in which it executes. This allows us to take units of behavior (objects) and apply them in new situations. To be context-independent, whatever an object needs to know about the larger environment it's running in must be passed in.
In this specific case of context independence I would call it 'environment independence'. In other words, a class with a hardcoded port number has an inappropriate dependency on the runtime OS environment; it essentially states 'I know that port 7778 will always be available', which is clearly wrong.
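To make the distinction concrete, here is a minimal sketch (the class names and the port number are invented for illustration) of the difference between a class that bakes in environment knowledge and one that has it passed in:

// Hypothetical example: the class "knows" about its runtime environment.
class HardcodedStatusReporter {
    private final int port = 7778;   // baked-in assumption about the environment
}

// Context-independent version: whatever the object needs to know is passed in.
class StatusReporter {
    private final int port;

    StatusReporter(int port) {       // the caller, or a config file, decides
        this.port = port;
    }
}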
The SOLID principles cover class design.
I suspect the idea that you should store configuration in configuration files isn't normally regarded as controversial enough to warrant inventing a special principle to persuade people! :)
Most people just figure it out from experience, the first time they try to get the software running anywhere other than their own development workstation.
While not strictly SOLID, another principle of OOD is the Common Closure Principle, which states that classes that change together are packaged together. Configuration information isn't exactly a class, but you can stretch the idea to cover it: since port numbers change for different reasons, and at different times, than the surrounding code, hardcoding them seems to violate the principle.
The Single Responsibility Principle (the S in SOLID) states that a class should only have one reason to change. This article gives an example of a Modem interface, and discusses how the details of how to connect and hang up are a separate responsibility from the communication of data, and will probably change for different reasons. You could use this to make a similar case for why port numbers are an extra "reason for change", separate from the class's main responsibility.
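From memory, the interface in that article has roughly this shape; splitting it leaves each piece with a single reason to change (treat this as a sketch rather than the article's exact code):

// One interface, two reasons to change: connection management and data exchange.
interface Modem {
    void dial(String phoneNumber);
    void hangup();
    void send(char c);
    char recv();
}

// Split along the two responsibilities:
interface Connection {
    void dial(String phoneNumber);
    void hangup();
}

interface DataChannel {
    void send(char c);
    char recv();
}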
Related
In brushing up on the SRP I read this document which I located via Uncle Bob's page on principles of OOD. I find the following passage puzzling and somewhat at odds with the rest of the document:
"If, on the other hand, the application is not changing in ways that cause the the two responsibilities to change at different times, then there is no need to separate them. Indeed, separating them would smell of Needless Complexity. There is a corollary here. An axis of change is only an axis of change if the changes actually occur. It is not wise to apply the SRP, or any other principle for that matter, if there is no symptom."
While I understand that the answer to many software development questions is "it depends", principles like the SRP appear to be almost universally beneficial, something to apply as a matter of course. The SRP itself affords code a high adaptability to future changes in requirements. Isn't the point to separate out responsibilities from the get-go, to avoid struggling with highly coupled code and cascading changes later on?
I would really appreciate some clarification on this to make sure my understanding of this core principle is correct. Thanks in advance!
From my humble understanding of the Modem example presented here, it is possible that the two responsibilities of the modem (connection and data exchange) will change as one.
You have two possibilities here:
When the protocol changes, it is possible that only the connection part changes, or only the data exchange part changes. In this case you should have two interfaces, because a change to the data-exchange protocol does not imply a change to the connection protocol.
When the protocol changes, it will always change both the connection part and the data exchange part. In that case you don't need two interfaces, because every time you have to rewrite the connection part you can be sure that the data exchange part will change as well. The two responsibilities sit on the same axis of change (the protocol handled by the modem), so you can leave them inside a single interface.
The key to this statement is "not changing in ways that cause the two responsibilities to change at different times". Let's say, for the sake of argument, you have a PaymentLogger and a Payment class. Every time you create a new PaymentType (CreditCard, Cash, Paypal, etc.) you need to update the PaymentLogger to log actions specific to those Payments. Instead of splitting out a PaymentLogger class, you could give the Payment class a method called Log that does whatever is specific to that payment type.
In this case it could be that the act of recording actions should be built into the class itself, since creating a new Payment type also requires a corresponding change to the PaymentLogger. It's a responsibility that should have been part of Payment all along.
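A sketch of the two shapes being compared (the class names come from the paragraph above, the method bodies are invented):

// Separate logger: every new payment type means a matching change in PaymentLogger.
class PaymentLogger {
    void log(Payment payment) { /* payment-type-specific logging */ }
}

// Folded in: if the two always change together, one Log method per payment type may be enough.
abstract class Payment {
    abstract void process();
    abstract void log();   // each payment type records its own actions
}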
When the original code was written there was only a need for, say, a LabTest class. But now say we have new requirements to add RadiologyTest, EKGTest, etc.
These classes have a lot in common hence it makes sense to have a base class.
But that will mean the LabTest class has to be modified. Let's say its interface remains the same as before; in other words, consumers of the LabTest class will not need to change.
Is this a violation of the open-closed principle? (LabTest is being modified.)
I think you can look at it from two perspectives: existing requirements and new requirements.
If the existing requirements didn't cover the need for these kinds of changes then I'd say, based on those requirements, LabTest did not violate OCP.
With the new requirements, you need to add functionality that does not fit the existing LabTest implementation. Adding it to LabTest would violate SRP. The requirements now create a new change vector that will force you to refactor LabTest to keep it OCP-compliant. If you fail to refactor, LabTest will violate both SRP and OCP. When you refactor, keep the new change vector in mind in any classes you create or modify.
These classes have a lot in common hence it makes sense to have a base class.
I think you may be violating SRP. After all, if each class does one task, how can two or more be so similar? If there's a task they both do identically, then that is a separate task and should be done by another class.
So I would say, first refactor LabTest into its constituent parts (hopefully you've got unit tests!). Then, when you come to write RadiologyTest and EKGTest, they can reuse the parts that make sense for them. This is also known as composition over inheritance.
But whatever you do, have the client code depend on interfaces to these classes. Don't force those who follow to use your base classes in order to add extensions.
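As a rough illustration of the composition idea (all names below are invented, not taken from the question): the shared behaviour extracted from LabTest lives in small classes, and each new test type composes the parts it needs behind a common interface.

interface MedicalTest {
    String run();
}

// Pieces pulled out of LabTest that other tests can reuse.
class SampleCollector { String collect() { return "sample"; } }
class ResultFormatter { String format(String raw) { return "Result: " + raw; } }

class LabTest implements MedicalTest {
    private final SampleCollector collector = new SampleCollector();
    private final ResultFormatter formatter = new ResultFormatter();

    public String run() {
        return formatter.format(collector.collect());
    }
}

class RadiologyTest implements MedicalTest {
    private final ResultFormatter formatter = new ResultFormatter();

    public String run() {
        return formatter.format("radiology image data");  // reuse without inheriting from LabTest
    }
}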
I may get burnt for this answer, but I'm going out on a limb anyway.
In my opinion (IMO), OCP cannot be followed in the purist sense the way other principles such as SRP, DIP, or ISP can.
If requirements change in such a way that you have to change the responsibility of a class for it to stay true to its representation of the domain model, then we have to change that class.
IMO, OCP stops us from re-factoring code to follow the evolution of the system.
Please correct me if I am wrong.
Update:
After further research, this is what I am thinking:
Let's say I have automated tests at both the unit level and the integration level. Then, IMO, we should redesign the complete system to fit the new model; OCP is out the door here.
IMO, the goal of system evolution is always to avoid hacks and to make the system represent its model as accurately as possible. (Leaving the LabTest class and its corresponding DB table unchanged so as not to break old code [i.e. not violating OCP], and instead using LabTest to store EKGTest's common data, whether by using LabTest inside EKGTest or by having EKGTest inherit from LabTest, would be a hack, IMO.)
I think the Open-Closed Principle (as outlined by Uncle Bob anyway, as opposed to Bertrand Meyer's version) isn't about never modifying classes; if software were never going to change, it might as well be hardware.
And in your own case, I don't think you're violating OCP, as you've mentioned that all uses of your class depend on the abstraction of LabTest rather than on the implementation of RadiologyTest.
In Uncle Bob's introductory paper he gives the example of a DrawAllShapes class that, if designed to OCP, shouldn't need to change each time a new subclass of Shape is added to the system. Regarding the level to which you apply it, Uncle Bob says:
It should be clear that no significant program can be 100% closed. For example, consider what would happen to the DrawAllShapes function from Listing 2 if we decided that all Circles should be drawn before any Squares. The DrawAllShapes function is not closed against a change like this. In general, no matter how "closed" a module is, there will always be some kind of change against which it is not closed.
Since closure cannot be complete, it must be strategic. That is, the designer must choose the kinds of changes against which to close his design.
I wouldn't read "closed for modification" as "don't refactor", more that you should design you classes in such a way that other classes can't make modifications which will affect you — e.g. applying the basic OO stuff — encapsulation via getters/setters & private member variables.
I can grasp the part "do one thing" via encapsulation, Dependency Injection, Principle of Least Knowledge, and You Ain't Gonna Need It; but how do I understand the second part "do it well?"
One example given was the notion of completeness, from the same YAGNI article:
for example, among features which allow adding items, deleting items, or modifying items, completeness could be used to also recommend "renaming items".
However, I found reasoning like that could easily be abused into feature creep, thus violating the "do one thing" part.
So, what is a litmus test for deciding whether a feature belongs in the "do it well" category (hence, include it in the function/class/program) or in the "do one thing" category (hence, exclude it)?
The first part, "do one thing," is best understood via UNIX's ls command as a counterexample: it includes an excessive number of flags for formatting its output, a job that should have been delegated entirely to an external program. But I don't have a good example to illustrate the second part, "do it well."
What is a good example where removing any further feature would make it not "do it well?"
I see "Do It Well" as being as much about quality of implementation of a function than about the completeness of a set functions (in your example having rename, as well as create and delete).
Do It Well manifests in many ways, some ways of thinking:
Behaviour in response to "special" inputs. Example, calculating the mean of some integers:
int mean(int[] values) { ... }
What does this do if the array has zero elements? What if the items sum to more than MAX_INT? (A defensive sketch follows after this list.)
Performance Characteristics. Has sufficient attention been given to behaviour as the data volumes increase?
Dependency Failures. If our implementation depends upon other modules or infrastructure, what happens when they fail? Examples: file system full, database down.
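For the "special inputs" point, here is one defensive sketch of mean; the behaviour chosen for the empty array is just one reasonable option, not the only one:

class Stats {
    // Accumulates into a long so the sum of ints cannot overflow, and makes the
    // zero-element case an explicit decision instead of a silent divide-by-zero.
    static int mean(int[] values) {
        if (values == null || values.length == 0) {
            throw new IllegalArgumentException("mean of zero elements is undefined");
        }
        long sum = 0;
        for (int value : values) {
            sum += value;
        }
        return (int) (sum / values.length);
    }
}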
Concerning feature creep itself, I think you're correct to identify a tension here. One thing you might consider: you don't need to implement every feature, provided it's pretty obvious that a feature could be added easily later without a complete rewrite.
The whole purpose of this advice is to make you favor quality over quantity.
The concept of one thing is subjective and depends on granularity. Would you say that a spreadsheet application does more than one thing if it can also print, or is that part of that one thing?
The point is that you should make sure that any feature, and the application itself, is done and will delight customers before you scramble to add new features.
I think your question points out the fundamentally organic nature of feature creep, and in understanding that nature, you will be empowered to meditate on the larger question.
Think of it like a garden: If you plant one thing and plant it well, say, a chrysanthemum, you aren't done at simply planting the seed. In fact you'll need to ensure that the soil is well tended, that the area is sufficiently protected, that the season is right, etc.
As your chrysanthemum (your one thing) grows, so too will other competing plants - some that need to be weeded out and others that may actually complement the original one thing. In fact, these other organisms may in some cases prove vital for the survival of your one thing.
As with those features you ain't gonna need, a bit of vigilance is required to determine which weeds represent feature creep and which represent vital and complementary functions.
Regardless, having done it well simply means that your chrysanthemum is hardy, healthy, and on time. :-)
I would say an email program without the ability to add attachments would be a good example.
This may sound like an odd example, but I'd say Dropbox is a good, albeit complex, example.
It's managed to beat off a swathe of similar competing apps through a dedication to simplification and a lack of the feature creep that, as you mentioned, would violate the 'do one thing' principle. The app lets you store documents in a folder that you can access anywhere, and that's about the limit of it. They drilled down to the core problem and solved it in a way that works perfectly well in 90+% of cases.
It's hard to put a hard-and-fast rule to it, but I'd say that catering to the roughly 90% majority of use cases and ignoring 'fringe requirements' is the best way to stick to this principle.
I'd guess 90+% of ls use is with no arguments or maybe two or three of the most popular. The 'do it well' principle should focus on what the majority of users need, instead of catering for power users or fringe cases, as ls does with its plethora of options.
This is what dropbox does successfully and why it is pretty well agreed upon as an example of good application design.
I seem to use bland words such as node, property, children (etc) too often, and I fear that someone else would have difficulty understanding my code simply because the parts' names are vague, common words.
How do you find creative names for classes and components to make them more memorable?
I am particularly having trouble with generic tools which have no real description except their rather generic functional purpose. I would like to know if others have found creative ways to name things rather than simply naming them by their utility, such as AnonymousFunctionWrapperCallerExecutorFactory.
It's hard to answer. I find them just because they seem to 'fit'.
What I do know, however, is that I find it basically impossible to move on writing code unless something is named correctly, and it 'feels' good. If it isn't named right, I find it hard to use, and the code is generally confusing.
I'm not too concerned about something being 'memorable', only 'accurate'.
I have been known to sit around thinking out loud about what to name something. Take your time, and make sure you are really happy with the name. Don't be afraid of using common/simple words.
I don't really have an answer, but three things for you to think about.
1. The late Phil Karlton famously said: "There are only two hard problems in computer science: cache invalidation and naming things." So the fact that you are having trouble coming up with good names is entirely normal and even expected.
2. OTOH, having trouble naming things can also be a sign of bad design. (And yes, I am perfectly aware that #1 and #2 contradict each other. Or maybe one should think of them as balancing each other.) E.g., if a thing has too many responsibilities, it is pretty much impossible to come up with a good name. (Witness all the "Service", "Util", "Model" and "Manager" classes in bad OO designs. Here's an example Google Code Search for "ManagerFactoryFactory".)
3. Also, your names should map to the domain jargon used by subject matter experts. If you can't find a subject matter expert, that's a sign that you are currently worrying about code that you're not supposed to worry about. (Basically, code that implements your core business domain should be implemented and designed well, code in ancillary domains should be implemented and designed so-so, and all other code should not be implemented or designed at all, but bought from a vendor, where what you are buying is their core business domain. [Please interpret "buy" and "vendor" liberally. Community-developed Free Software is just fine.])
Regarding #3 above, you mentioned in another comment that you are currently working on implementing a tree data structure. Unless your company is in the business of selling tree data structures, that is not a part of your core domain. And the reason that you have trouble finding good names could be that you are working outside your core domain. Now, "selling tree data structures" may sound stupid, but there are actually companies that do that. For example, the BCL team inside Microsoft's developer division: they actually sell (well, for certain definitions of "sell", anyway) the .NET framework's Base Class Libraries, which include, among others, tree data structures. But note that for example Microsoft's C++ compiler team actually (literally) buys their STL from a third-party vendor – they figure that their core domain is writing compilers, and they leave the writing of libraries to a company who considers writing STLs their core domain. (And indeed, AFAIK, that company does nothing but write and sell STL implementations. That's their sole product.)
If, however, selling tree data structures is your core domain, then the names you listed are just fine. They are the names that subject matter experts (programmers, in this case) use when talking about the domain of tree data structures.
Using 'metaphors' is a common theme in agile (and pattern) literature.
'Children' (in your question) is an example of a metaphor that is extensively used and for good reasons.
So, I'd encourage the use of metaphors, provided they are applicable and not a stretch of the imagination.
Metaphors are everywhere in computing. From files to bugs to pointers to streams... you can't avoid them.
I believe that for the purposes of standardization and communication, it's good to use a common vocabulary, as with design patterns. I have a problem with a programmer who keeps 'inventing' his own terms, and I have trouble understanding him. (He kept using the term 'events orchestrating' instead of 'scripting' or 'FCFS process'. Kudos for creativity, though!)
That common vocabulary describes things we are used to. A node is a point somewhere in a graph, in a tree, or whatnot. One option is to be specific to the domain: if we are doing a mapping problem, instead of 'node' we can use 'location'. That helps, at least for me. So I find there is a need to balance being able to communicate with other programmers against keeping the descriptor specific enough to help me remember what it does.
I think node, children, and property are great names. I can already guess the following about your classes, just by their "bland" names:
Node - this class is part of a graph of objects
children - this variable holds a list of nodes belonging to the containing node.
I don't think "node" is either vague or common, and if you're coding a generic data structure, it's probably ok to have generic names! (With that being said, if you are coding up a tree, you could use something like TreeNode to emphasize that the node is part of a tree.) One way you can make the life of developers who will use your API easier is to follow the naming conventions of your platform's built in libraries. If everyone calls a node a node, and an iterator an iterator, it makes life easy.
Names that reflect the purpose of the class, method, or property are more memorable than creative ones. Modern IDEs make it easier to use longer names, so feel free to be descriptive. Getting creative won't help as much as getting accurate.
I recommend picking nouns from the specific application domain. E.g., if you are putting cars in a tree, call the node class Car; the fact that it is also a node should be apparent from the API. Also, don't try to be too generic in your implementation: don't put all attributes of the car into a hashtable named properties, but create separate attributes for make, color, etc.
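A small sketch of that contrast (the field names are invented examples):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Too generic: callers must know magic keys like "make" and "color".
class GenericNode {
    Map<String, Object> properties = new HashMap<>();
}

// Domain-specific: the class name and attributes say what it is,
// and the children field makes the tree role apparent from the API.
class Car {
    String make;
    String color;
    List<Car> children = new ArrayList<>();
}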
A lot of languages and coding styles like to use all sorts of descriptive prefixes. In PHP there are no clear types, so this may help greatly. Instead of doing
$isAvailable = true;
try
$bool_isAvailable = true;
It is admittedly a pain, but usually well worth the time.
I also like to use long names to describe things. It may seem strange, but it is usually easier to remember, especially when I go back to refactor my code:
$leftNode->properties < $leftTreeNode->arrayOfNodeProperties; // the longer, more descriptive name on the right is easier to come back to
And if all else fails. Why not fall back on a solid star wars themed program.
$luke->lightsaber($darth[$ewoks]);
And lastly, in college I named my classes after my professor, and then named my class methods after all the things I wanted to do to that jerk.
$Kube->canEat($myShorts, $withKetchup);
When should you NOT use a singleton class although it might be very tempting to do so? It would be very nice if we had a list of most common instances of 'singletonitis' that we should take care to avoid.
Do not use a singleton for something that might evolve into a multipliable resource.
This probably sounds silly, but if you declare something a singleton you're making a very strong statement that it is absolutely unique. You're building code around it, more and more. And when you then find out after thousands of lines of code that it is not a singleton at all, you have a huge amount of work in front of you because all the other objects expect "the" sacred object of class WizBang to be a singleton.
Typical example: "There is only one database connection this application has, thus it is a singleton." - Bad idea. You may want to have several connections in the future. Better create a pool of database connections and populate it with just one instance. Acts like a Singleton, but all other code will have growable code for accessing the pool.
EDIT: I understand that theoretically you can extend a singleton into several objects. Yet there is no real life cycle (like pooling/unpooling) which means there is no real ownership of objects that have been handed out, i.e. the now multi-singleton would have to be stateless to be used simultaneously by different methods and threads.
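A rough sketch of the "pool populated with one instance" idea (all names invented); the calling code never changes when the pool later grows:

import java.util.ArrayList;
import java.util.List;

class DbConnection { /* wraps a real connection */ }

class ConnectionPool {
    private final List<DbConnection> free = new ArrayList<>();

    ConnectionPool(int size) {
        for (int i = 0; i < size; i++) {
            free.add(new DbConnection());
        }
    }

    DbConnection acquire() { return free.remove(free.size() - 1); }  // simplified: no blocking, no empty-pool check

    void release(DbConnection connection) { free.add(connection); }
}

// Today: new ConnectionPool(1). Tomorrow: new ConnectionPool(10). Callers don't change.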
Well, singletons for the most part are just a way of making things static anyway. So you're either in effect making data global (and we all know global variables are bad), or you're writing static methods (and that's not very OO now, is it?).
Here is a more detailed rant on why singletons are bad, by Steve Yegge. Basically, you shouldn't use singletons in almost all cases; you can't really know that something is never going to be needed in more than one place.
I know many have answered with "when you have more than one", etc.
Since the original poster wanted a list of cases when you shouldn't use Singletons (rather than the top reason), I'll chime in with:
Whenever you're using it because you're not allowed to use a global!
I've lost count of the number of times a junior engineer has used a singleton because they knew I didn't accept globals in code reviews. They often seem shocked when I point out that all they did was replace a global with the Singleton pattern, and they still just have a global!
Here is a rant by my friend Alex Miller... It does not exactly enumerate "when you should NOT use a singleton" but it is a comprehensive, excellent post and argues that one should only use a singleton in rare instances, if at all.
I'm guilty of a big one a few years back (thankfully I've learned my lesson since then).
What happened is that I came on board a desktop app project that had been converted to .NET from VB6 and was a real mess: things like 40-page (printed) functions and no real class structure. I built a class to encapsulate access to the database. Not a real data tier (yet), just a base class that a real data tier could use. Somewhere along the way I got the bright idea to make this class a singleton. It worked okay for a year or so, and then we needed to build a web interface for the app as well. The singleton ended up being a huge bottleneck for the database, since all web users had to share the same connection. Again... lesson learned.
Looking back, it probably actually was the right choice for a short while, since it forced the other developers to be more disciplined about using it and made them aware of scoping issues not previously a problem in the VB6 world. But I should have changed it back after a few weeks before we had too much built up around it.
Singletons are virtually always a bad idea and generally useless/redundant since they are just a very limited simplification of a decent pattern.
Look up how Dependency Injection works. It solves the same problems, but in a much more useful way; in fact, you'll find it applies to many more parts of your design.
Although you can find DI libraries out there, you can also roll a basic one yourself; it's pretty easy.
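A hand-rolled constructor-injection sketch (names invented) of the point being made: the "only one instance" decision lives at the composition root instead of being baked into the class as a singleton.

class AuditLog {                        // plain class: no getInstance(), no static state
    void write(String entry) { System.out.println(entry); }
}

class OrderService {
    private final AuditLog auditLog;

    OrderService(AuditLog auditLog) {   // the dependency is handed in, not looked up
        this.auditLog = auditLog;
    }

    void placeOrder() { auditLog.write("order placed"); }
}

class Main {
    public static void main(String[] args) {
        AuditLog sharedLog = new AuditLog();       // one instance today...
        new OrderService(sharedLog).placeOrder();  // ...but nothing stops a second one in tests
    }
}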
I try to have only one singleton - an inversion of control / service locator object.
IService service = IoC.GetImplementationOf<IService>();
One of the things that tends to make a singleton a nightmare is modifiable global state. I worked on a project where singletons were used all over the place for things that should have been solved in a completely different way (passing in strategies, etc.). The "de-singletonification" was in some cases a major rewrite of parts of the system. I would argue that in most cases where people reach for a singleton it's just wrong: it looks nice at first, but it turns into a problem, especially in testing.
When you have multiple applications running in the same JVM.
A singleton is a singleton across the entire JVM, not just a single application. Even if multiple threads or applications seem to be creating a new singleton object, they're all using the same one if they run in the same JVM.
Sometimes, you assume there will only be one of a thing, then you turn out to be wrong.
Example, a database class. You assume you will only ever connect to your app's database.
// It's our database! We'll never need another
class Database
{
};
But wait! Your boss says to hook up to some other guy's database. Say you want to add phpBB to the website and would like to poke its database to integrate some of its functionality. Should we make a new singleton or another instance of Database? Most people agree that a new instance of the same class is preferred; there is no code duplication.
You'd rather have
Database ourDb;
Database otherDb;
than copy-paste Database and make:
// Copy-pasted from our home-grown database.
class OtherGuysDatabase
{
};
The slippery slope here is that you might stop thinking about making new instances of classes and instead begin thinking it's OK to have one type per instance.
In the case of a connection (for instance), it makes sense that you wouldn't want to make the connection itself a singleton: you might need four connections, or you may need to destroy and recreate a connection a number of times.
But why wouldn't you access all of your connections through a single interface (i.e. a connection manager)?
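One possible shape for such a connection manager (the interface and names are invented, just to illustrate a single access point that is not a singleton):

import java.util.HashMap;
import java.util.Map;

class ManagedConnection {
    ManagedConnection(String databaseName) { /* open the real connection */ }
}

// Single point of access, but free to hand out, reuse, or recreate
// as many connections as the application needs.
class ConnectionManager {
    private final Map<String, ManagedConnection> open = new HashMap<>();

    ManagedConnection connectionFor(String databaseName) {
        return open.computeIfAbsent(databaseName, ManagedConnection::new);
    }

    void closeAll() { open.clear(); /* and dispose each connection */ }
}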