What is the difference between Single Responsibility Principle and Separation of Concerns?
Single Responsibility Principle (SRP): give each class just one reason to change, where “reason to change” == “responsibility”. For example, an Invoice class does not have a responsibility to print itself.
Separation of Concerns (SoC, since 1974): a concern == a feature of the system. Take care of each concern separately: while working on one concern, the other concerns are irrelevant. It is about hiding the implementation of behavior.
From here.
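A minimal sketch of the Invoice example above, with illustrative names only (not from any particular codebase):

    // The Invoice knows its own data and business rules...
    class Invoice {
        private final String customer;
        private final double total;

        Invoice(String customer, double total) {
            this.customer = customer;
            this.total = total;
        }

        String getCustomer() { return customer; }
        double getTotal() { return total; }
    }

    // ...while printing/formatting is a separate responsibility, so a change
    // to the output format never touches Invoice.
    class InvoicePrinter {
        String print(Invoice invoice) {
            return "Invoice for " + invoice.getCustomer() + ": " + invoice.getTotal();
        }
    }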
The Single Responsibility Principle and Separation of Concerns are really the same thing.
Sure, you can get bogged down in an academic discussion trying to tease out some kind of difference between the two, but why? For all intents and purposes, they describe the same thing. The biggest problem is people get so caught up in wanting to know exactly what a "concern" and "responsibility" are, that they perhaps miss the important idea behind SRP and SoC.
That idea is simply to split your codebase into loosely coupled, isolated parts. This allows multiple developers to work on different parts without affecting each other, it also allows a single developer to modify one isolated part without breaking another.
This is applied at the module level, e.g. MVC is an architectural pattern promoting SRP and SoC. The codebase is split out into isolated models, views and controllers. That way, the modification of a view can be done independently of a model. The two aren't horrifically intertwined.
At a lower level, this should be applied to classes too. Instead of putting dozens of methods in a single class, split the code out into several. For the same reasons.
Also, even at a method level, split large methods out into smaller methods.
In principle, that is. SRP is a principle, not a rule, so you don't have to (read: can't/shouldn't) follow it religiously to the extreme. It doesn't mean going so far as to have only one seven-line method in each class, for example. It is just a general principle of splitting code into isolated parts. The point is that it will lead to a better codebase and more stable software.
In my opinion, the Single Responsibility Principle is one of the tools/idioms for achieving Separation of Concerns.
Separation of Concerns vs Single Responsibility Principle (SoC vs SRP)
From the linked article:
Separation of Concerns (SoC) – is the process of breaking a computer program into distinct features that overlap in functionality as little as possible. A concern is any piece of interest or focus in a program. Typically, concerns are synonymous with features or behaviors.
http://en.wikipedia.org/wiki/Separation_of_concerns
Single Responsibility Principle (SRP) – every object should have a single responsibility, and that all its services should be narrowly aligned with that responsibility. On some level Cohesion is considered as synonym for SRP.
http://en.wikipedia.org/wiki/Single_responsibility_principle
Single Responsibility states that an object should be responsible for a single unit of work.
Separation of Concerns states that applications should be split into modules whose functionality overlaps as little as possible.
Similar end results...slightly different applications.
Since none of the previous answers quote Robert Martin, who created the Single Responsibility Principle, I think a more authoritative answer is needed here.
Martin's inspiration for the SRP was drawn from David Parnas, Edsger Dijkstra (who coined the term Separation of Concerns) and Larry Constantine (who coined the terms Coupling and Cohesion). Martin consolidated their ideas into the SRP.
Another wording for the Single Responsibility Principle is:
Gather together the things that change for the same reasons. Separate those things that change for different reasons.
If you think about this you’ll realize that this is just another way to define cohesion and coupling. We want to increase the cohesion between things that change for the same reasons, and we want to decrease the coupling between those things that change for different reasons.
However, as you think about this principle, remember that the reasons for change are people. It is people who request changes. And you don’t want to confuse those people, or yourself, by mixing together the code that many different people care about for different reasons.
To the original question, the minor difference between the SRP and SoC is that Martin refined the term concerns to refer to people.
Separation of concerns (SoC). Divide your application into distinct features with as little overlap in functionality as possible. (Microsoft).
“Concern” = “distinct feature” = “distinct section”
“Concern” works at both high and low levels
Single responsibility principle states that every module or class should have responsibility over a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by the class. All its services should be narrowly aligned with that responsibility. (Wikipedia definition)
“Responsibility” = “reason to change”. Change what? “A single part of the functionality provided by the software” = basic unit
Conclusion
Single Responsibility Principle works on basic units -> works at low level
Separation of Concerns works at both high and low levels
SRP and SoC work together for separation of concerns. They are exactly the same at low level.
Separation of Concerns is a process; the Single Responsibility Principle is a design / architecture philosophy. They're not completely disjoint, but they serve different purposes.
Similar, but: SoC is about breaking a complex problem down into several concerns, whereas SRP is about having just one responsibility.
SRP and SoC work at different levels of abstraction. In both cases the purpose is to reduce coupling and strengthen cohesion. While SRP works more at the object level, SoC may also work at the level of a function's implementation. A function can be implemented by one object but also by several objects. Therefore the coupling and cohesion addressed by the two principles may differ.
I tried to draw a comparison between Separation of Concerns (SoC) and the Single Responsibility Principle (SRP).
Differences
The SRP works at the class level, whereas SoC applies to each computer program at the abstraction level, ... or sometimes the architectural level.
The SRP is about how (not what): dividing your domain into cohesive classes which each have just one responsibility (one reason to change). On the other side, SoC is a design principle for separating a context into distinct sections, such that each section addresses a separate concern (what, not how); there are many tools (for example classes, functions, modules, packages, ...) for reaching this goal at different levels.
The concept of SRP is based on cohesion (high cohesion), whereas SoC is close to modularity, divide and conquer (D&C), ... at every level of abstraction.
SoC is a good design principle for coping with complexity, much like abstraction, and to reach single-responsibility classes you can use SoC as a great tool. Indeed, one way to know that a class has more than one responsibility is whether you can extract another responsibility (concern) from that class.
Similarities
After applying each of these principles, your context becomes more reusable, maintainable, robust, readable, ... .
Answer:
Separation of Concerns (SoC) is a more versatile term - it can be applied at the system level or at lower levels such as classes (or even the methods within a class)
Single Responsibility Principle (SRP) is used to discuss SoC at the lower levels e.g. in a class
Ways to think about it:
At the low level, SoC and SRP are synonymous. So you could say SRP is a redundant term - or that SoC should only be used to discuss the system level
Given (1), the term SoC is somewhat ambiguous. You need context to know whether the discussion is about high level SoC or lower level SoC
To remember that SRP is the term for only lower levels, think of this: in everyday language, a "responsibility" is usually a well-defined thing that can be tied to specific code, whereas "concerns" are usually kind of vague and may encompass a bunch of related things, which is perhaps why SoC is a more natural fit for discussing the system level than is SRP
SoC is in some sense a stronger requirement/principle than SRP because it applies at the system level, and to be truly achieved at the system level it must also be used in the development of the system components. That is, a high level of SoC implies decent SoC/SRP at the lower levels - but the reverse is not true: lower level SoC/SRP does not imply SoC or anything at all for the next higher level, never mind the encompassing system. For an example of SoC/SRP that is achieved at the method level, but then violated at the class level, check out Artur Trosin's blog post.
Separation of Concerns
The Separation of Concerns (SOC) principle states that a code artifact should allow one to focus one's attention upon a certain aspect. A code artifact can be anything, from a specific function, to a class, or a whole package, or even an entire application. The SOC principle can be applied to every architectural level in one's applications. An example of an architecture where SOC is applied is a layered architecture.
Single Responsibility Principle
The Single Responsibility Principle (SRP) states that "A module should have one, and only one, reason to change" (Clean Architecture, Martin, p. 62). The SRP applies to the module level and when applying the SRP one should be focused on the reason to change.
Conclusion
The SOC principle states that a code artifact should allow one to focus one's attention upon a certain aspect. The SRP makes this concrete by stating that at the module level we should focus our attention on the reason for change. Thus the SRP is a form of SOC.
P.S. For completeness: the Common Closure Principle
The Common Closure Principle (CCP) is a restatement of the SRP at an even higher level, the component level. The CCP states that classes that change for the same reasons and at the same times should be gathered into the same components (Clean Architecture, p. 105). The CCP is another example of a SOC.
Related
I can grasp the part "do one thing" via encapsulation, Dependency Injection, Principle of Least Knowledge, and You Ain't Gonna Need It; but how do I understand the second part "do it well?"
An example given was the notion of completeness, given in the same YAGNI article:
for example, among features which allow adding items, deleting items, or modifying items, completeness could be used to also recommend "renaming items".
However, I found reasoning like that could easily be abused into feature creep, thus violating the "do one thing" part.
So, what is a litmus test for whether a feature belongs to the "do it well" category (hence, include it in the function/class/program) or to the "do one thing" category (hence, exclude it)?
The first part, "do one thing," is best understood via UNIX's ls command as a counterexample, with its inclusion of an excessive number of flags for formatting its output, something which should have been delegated entirely to an external program. But I don't have a good example to see the second part "do it well."
What is a good example where removing any further feature would make it not "do it well?"
I see "Do It Well" as being as much about the quality of implementation of a function as about the completeness of a set of functions (in your example, having rename as well as create and delete).
Do It Well manifests in many ways, some ways of thinking:
Behaviour in response to "special" inputs. Example, calculating the mean of some integers:
int mean(int[] values) { ... }
What does this do if the array has zero elements? What if the items' total overflows MAX_INT? (A defensive sketch follows this list.)
Performance Characteristics. Has sufficient attention been given to behaviour as the data volumes increase?
Dependency Failures. If our implementation depends upon other modules or infrastructure what happens when these fail. Example: File System Full, Database Down?
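Returning to the mean example from the first point, a defensive sketch might look like this (whether to throw or do something else for empty input is a design choice, not a prescription):

    class Statistics {
        // Uses a long accumulator so summing ints cannot overflow,
        // and makes the empty-array behaviour explicit instead of dividing by zero.
        static double mean(int[] values) {
            if (values == null || values.length == 0) {
                throw new IllegalArgumentException("mean() requires at least one value");
            }
            long sum = 0;
            for (int v : values) {
                sum += v;
            }
            return (double) sum / values.length;
        }
    }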
Concerning feature creep itself, I think you're correct to identify a tension here. One thing you might consider: you don't need to implement every feature, provided that it's pretty obvious the feature could be added easily without a complete rewrite.
The whole purpose of this advice is to make you favor quality over quantity.
The concept of one thing is subjective and depends on granularity. Would you say that a spreadsheet application does more than one thing if it can also print, or is that part of that one thing?
The point is that you should make sure that any feature, and the application itself, is done and will delight customers before you scramble to add new features.
I think your question points out the fundamentally organic nature of feature creep, and in understanding that nature, you will be empowered to meditate on the larger question.
Think of it like a garden: If you plant one thing and plant it well, say, a chrysanthemum, you aren't done at simply planting the seed. In fact you'll need to ensure that the soil is well tended, that the area is sufficiently protected, that the season is right, etc.
As your chrysanthemum (your one thing) grows, so too will other competitive plants - some that need to be weeded out and others that may actually complement the original one thing. In fact, these other organisms may in some cases prove vital for the survival of your one thing.
Like those features that YAGN, a bit of vigilance is required to determine which weeds represent feature creep and which represent vital and complementary functions.
Regardless, having done it well means simply that your chrysanthemum is hardy, healthy, and on time. :-)
I would say an email program without the ability to add attachments would be a good example.
This may sound like an odd example, but I'd say dropbox is a good, albeit complex example.
It has managed to beat off a swathe of similar competing apps through a dedication to simplification and a lack of the feature creep that, as you mentioned, would violate the 'do one thing' principle. The app lets you store documents in a folder that you can access anywhere, and that's about the limit of it. They drilled down to the core problem and solved it in a way that works perfectly well in 90+% of cases.
It's hard to put a hard-and-fast rule to it, but I'd say that catering to the 90% majority of use cases and ignoring 'fringe requirements' is the best way to stick to this principle.
I'd guess 90+% of ls use is with no arguments or maybe two or three of the most popular. The 'do it well' principle should focus on what the majority of users need, instead of catering for power users or fringe cases, as ls does with its plethora of options.
This is what dropbox does successfully and why it is pretty well agreed upon as an example of good application design.
Obviously, "Hello World" doesn't require a separated, modular front-end and back-end. But any sort of Enterprise-grade project does.
Assuming some sort of spectrum between these points, at which stage should an application be (conceptually, or at a design level) multi-layered? When a database or some external resource is introduced? When you find that you're anticipating spaghetti code in your methods/functions?
when a database, or some external resource is introduced.
but also:
always (except for the most trivial of apps) separate AT LEAST the presentation tier and the application tier
see:
http://en.wikipedia.org/wiki/Multitier_architecture
Layers are a means of keeping a design loosely coupled and highly cohesive.
When you start to have a few classes (either implemented or just sketched with UML), they can be grouped logically into layers - or, more generally, packages or modules. This is the art of separating concerns.
The sooner the better: if you do not start layering early enough, you risk never doing it at all, as the effort required can become too great.
Here are some criteria for when to:
Any time you anticipate the need to replace one part of it with a different part.
Any time you find yourself needing to divide work amongst parallel teams.
There is no real answer to this question. It depends largely on your application's needs, and numerous other factors. I'd suggest reading some books on design patterns and enterprise application architecture. These two are invaluable:
Design Patterns: Elements of Reusable Object-Oriented Software
Patterns of Enterprise Application Architecture
Some other books that I highly recommend are:
The Pragmatic Programmer: From Journeyman to Master
Refactoring: Improving the Design of Existing Code
No matter your skill level, reading these will really open your eyes to a world of possibilities.
I'd say that, in most cases, when the concepts your code deals with span multiple distinct levels of abstraction, that is a strong signal to mirror this with levels of abstraction in your implementation.
This does not override the scenarios that others have highlighted already though.
I think once you ask yourself "hmm should I layer this" the answer is yes.
I've worked on too many projects that probably started off as a proof of concept/prototype and ended up being full projects used in production, which are horribly written and just reek of "get it done quick, we'll fix it later." Trust me, you won't fix it later.
The Pragmatic Programmer lists this as the Broken Window Theory.
Try and always do it right from the start. Separate your concerns. Build it with modularity in mind.
And of course try and think of the poor maintenance programmer who might take over when you're done!
Thinking of it in terms of layers is a little limiting. It's what you see in whitepapers about a product, but it's not how products really work. They have "boxes" that depend on each other in various ways, and you can make it look like they fit into layers but you can do this in several different configurations, depending on what information you're leaving out of the diagram.
And in a really well-designed application, the boxes get very small. They are down to the level of individual interfaces and classes.
This is important because whenever you change a line of code, you need to have some understanding of the impact your change will have, which means you have to understand exactly what the code currently does, what its responsibilities are, which means it has to be a small chunk that has a single responsibility, implementing an interface that doesn't cause clients to be dependent on things they don't need (the S and the I of SOLID).
You may find that your application can look like it has two or three simple layers, if you narrow your eyes, but it may not. That isn't really a problem. Of course, a disastrously badly designed application can look like it has layers or tiers if you squint as hard as you can. So those "high level" diagrams of an "architecture" can hide a multitude of sins.
My generic rule of thumb is to at least separate the problem into a model layer and a view layer, and to throw in a controller if there is a possibility of more than one way of handling the model or piping data to the view.
(Or as the first answer, at least the presentation tier and the application tier).
Loose coupling is all about minimising dependencies, so I would say 'layer' when a dependency is introduced. i.e. a database, third party application, etc.
Although 'layer' is probably the wrong term these days. Most of the time I use Dependency Injection (DI) through an Inversion of Control container such as Castle Windsor. This means that I can code on one part of my system without worrying about the rest. It has the side effect of ensuring loose coupling.
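As a minimal, container-agnostic sketch of constructor injection (the interface and class names here are made up for illustration, not tied to Castle Windsor or any other container):

    // The service depends on an abstraction, not a concrete database class.
    interface OrderRepository {
        void save(String order);
    }

    class OrderService {
        private final OrderRepository repository;

        // The dependency is injected, so the service can be wired up by a container
        // (or unit-tested with a fake) without knowing the concrete type.
        OrderService(OrderRepository repository) {
            this.repository = repository;
        }

        void placeOrder(String order) {
            repository.save(order);
        }
    }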
I would recommend DI as a general programming principle all of the time so that you have the choice on how to 'layer' your application later.
Give it a look.
I know the rule of thumb is that a noun used by the user is potentially a class. Similarly, a verb may be made into an action class, e.g. a predicate.
Given a description from the user, how do you -
identify what is not to be made into a class
The only real answer is experience. However, some things fairly obviously (to me, anyway) cannot be modelled in your design. For example if the use case says:
"and then the parcel is put on the UPS van"
There is no need to model the van. You can make decisions of this kind by considering the system boundaries - you don't and can't control the van. However,
"we make a request to UPS for pickup"
might well result in a UPSPickup object.
The rules are simple.
Everything is an object.
All objects belong to classes.
In rare (very rare) circumstances you have some specialized class/object confusion.
A "library" of all static methods. This is an implementation choice, and no user can see this.
A Singleton where there can only be one object of the given class. This does happen sometimes.
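Rough sketches of those two special cases (names are purely illustrative):

    // A "library" of static methods: never instantiated, just a namespace.
    final class MathUtils {
        private MathUtils() {}   // no instances
        static int clamp(int value, int lo, int hi) {
            return Math.max(lo, Math.min(hi, value));
        }
    }

    // A Singleton: exactly one object of the given class.
    enum Configuration {
        INSTANCE;
        private final String environment = "production";
        String getEnvironment() { return environment; }
    }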
In an OO language it is not a question of what is to be made into a class, but rather 'what class does this data/functionality go into?'
Like other software architecture aspects there are rules, but ultimately it is an art that requires experience. There are lots of books on software design, but a simple reference is Coupling and Cohesion.
Cohesion of a single module/component is the degree to which its responsibilities form a meaningful unit; higher cohesion is better.
Coupling between modules/components is their degree of mutual interdependence; lower coupling is better.
I was just looking at this question about SQL, and followed a link about DAO to wikipedia. And it mentions as a disadvantage:
"As with many design patterns, a design pattern increases the complexity of the application." -Wikipedia
Which suddenly made me wonder where this idea came from (because it lacks a citation). Personally, I always considered that patterns reduce the complexity of an application, but I might be delusional, so I'm wondering whether this claim about complexity is based on something or not.
Thanks.
If the person reading the code is aware of design patterns and their concept and is able to identify design patterns in practical use (not just the book examples) then they really do reduce the complexity.
However, I've found that a lot of junior developers, who haven't heard much about design patterns or weren't aware of them at all, believe their use increases the complexity of the code.
I can understand it: You suddenly have more classes or code to go through to solve what seems to be a simple problem at first. If you're not aware of the benefits of design patterns, hacked solutions always look better.
I love design patterns, but they (apart from simple ones like Singleton) definitely add complexity to an application. They add some dimension to a design that is not intuitively obvious to a novice designer (and not part of the features of the programming language).
Some people might feel patterns reduce complexity because of the benefits they bring in terms of the software's non-functional requirements such as maintainability, extendability, reusability, etc. However, I disagree and see the benefits as a return on complexity investment. Perhaps in some cases patterns reduce complexity, but a theoretical discussion like that sheds more heat than light. Almost none of the answers so far used concrete examples, save https://stackoverflow.com/a/760968/1168342.
To be specific, many patterns increase accidental complexity of a design by introducing new structures (interfaces, methods, etc.) that weren't present in the design before the pattern was applied.
Let's use Visitor as an example.
Visitor is a way of separating operations from an object structure on which they operate.
Before the solution with Visitor, the operations are hard-coded into each Element of the object structure. The challenge for the developer is that adding new operations involves modifying the code in the various elements.
After the application of the Visitor pattern, there is an additional class hierarchy of visitors, which encapsulate the operations.
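Roughly, the shape of the double dispatch looks like this (the element and visitor names below are generic placeholders, not from any particular codebase):

    // The fixed object structure: elements only know how to accept a visitor.
    interface Element {
        void accept(Visitor visitor);   // first (virtual) dispatch on the element
    }

    class Circle implements Element {
        public void accept(Visitor visitor) { visitor.visit(this); }   // second dispatch on the visitor
    }

    class Square implements Element {
        public void accept(Visitor visitor) { visitor.visit(this); }
    }

    // New operations are added as new visitors instead of editing the elements.
    interface Visitor {
        void visit(Circle circle);
        void visit(Square square);
    }

    class AreaCalculator implements Visitor {
        public void visit(Circle circle) { /* compute circle area */ }
        public void visit(Square square) { /* compute square area */ }
    }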
The flow of control in the solution is definitely more complex, and will be harder to debug (anyone who has implemented Visitor and tried to follow the program flow of double-dispatched calls with accept/visit will know this). Understanding and maintaining Visitor functions in terms of cohesive units is less complicated than the alternative of coding functions into each of the Elements in the fixed structure that is visited. This is the benefit of the pattern.
It's difficult to say quantitatively how much increase there is in accidental complexity or how much easier it is to add new operations. I certainly don't agree with answers that make a blanket statement saying in the long-term, complexity is reduced with applying a pattern. It's not like your design "forgets" the double-dispatch added by Visitor's approach, just because you have code which more easily allows operations to be added. The complexity is a price (or tax) you pay to get the benefit in maintainability.
Patterns still have to be applied
Regardless of one's supposed familiarity with patterns, any given pattern must be applied to a solution. That application is going to be different every time (Martin Fowler said patterns are only half-baked solutions). Developers will always have to understand what classes are playing what roles in the existing design, which is subject to the essential complexity (the application problem's complexity) that is often non-trivial.
In the best case, understanding a design pattern applied in an application that's already complex may be trivial, but it's not 0 effort:
Patterns aren't always applied the same way. There are many variants of patterns -- Proxy comes to mind. I'm not sure that everyone agrees about how any given pattern should be applied.
Introducing one pattern (e.g., Strategy to encapsulate algorithms) often leads to other patterns to manage things properly (e.g., Factory to instantiate the concrete Strategies).
Introducing a pattern often leads to more responsibilities. Object cleanup when a Factory is used is not trivial (and also not documented in GoF). How many know about the so-called Lapsed-listener problem?
What happens if there is a change in the assumptions made about the need for the pattern (e.g., there is no longer a need to have multiple encapsulated algorithms provided by the Strategy pattern)? It's going to be extra work to remove the pattern later. If you don't remove it, new developers could be duped by its presence when they come on board. Patterns are intertwined between the classes playing the roles in the pattern. Removal is not trivial.
Erich Gamma gave an anecdote at ECOOP 2006 that designers in one case decided to remove the Abstract Factory pattern from a commercial multi-platform GUI widget framework (the classic Abstract Factory example!). As I remember the anecdote, the multiple-levels of indirection (polymorphic calls) in complex GUIs was a significant performance hit in the client code. Customers complained about GUIs being sluggish, and the "optimization" was to remove the indirections. In this case, performance trumped maintainability; the pattern was only making the coders happy, not the end users.
DAO example
In terms of the DAO example you cite in the question, if you're coding an application that will never need to run with varying databases, then the DAO pattern is an unneeded level of complexity. In general, if your code doesn't need the benefit that a pattern is supposed to provide, applying that pattern will increase your application's complexity unnecessarily.
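For reference, the layer the DAO pattern introduces is roughly this shape (a sketch with invented names, not a recommendation):

    // Callers depend on this interface rather than on SQL or a driver.
    interface UserDao {
        User findById(long id);
        void save(User user);
    }

    // One concrete implementation per storage technology.
    class JdbcUserDao implements UserDao {
        public User findById(long id) { /* run a SELECT and map the row */ return null; }
        public void save(User user)   { /* run an INSERT or UPDATE */ }
    }

    class User {
        long id;
        String name;
    }

If the application will only ever talk to one database, this extra indirection is pure overhead.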
Revolving door metaphor
Using buildings as a metaphor, let's consider a revolving door as a building design pattern. (The image that accompanied this answer came from Wikipedia.)
You can "see" the additional complexity in such a door. The benefits of revolving doors are in energy savings. They attempt to solve the problem where people are frequently going in and out of a building, and opening/closing a standard door allows too much air to be exchanged between the inside and the outside of the building each time.
It probably wouldn't make sense to install a revolving door as the entrance of a two-bedroom house, because there is not enough traffic to justify the additional complexity. The revolving door would still work in a two-bedroom house. The benefits in terms of energy savings would be small (and might actually be worse because of size and air-tightness relative to a conventional door). The door would surely cost more and would take up more space than a traditional door.
Design patterns often lead to additional levels of abstraction around a problem, and if not handled correctly then too much abstraction can lead to complexity.
However, since design patterns provide a common vocabulary to communicate ideas they also reduce complexity and increase maintainability.
At the end of the day it's a double-edged sword, but I can't imagine a situation where I'd avoid using a design pattern...
There's an infamous disease known as "Small Boy With A Pattern Syndrome" that usually strikes someone who has recently read the GoF book for the first time and suddenly sees patterns everywhere. That can add complexity and unnecessary abstraction.
Patterns are best added to code as a discovery or refactoring to solve a particular problem, in my opinion.
In the short term, design patterns will often increase the complexity of the code. They add extra abstractions in places they might not be strictly necessary. However, in the long term they reduce complexity because future enhancements and changes fit into the patterns in a simple way. Without the patterns, these changes would be much more intrusive and complexity would likely be much higher.
Take for example a decorator pattern. The first time you use it, it will add complexity because now you have to define an interface for the object and create another class to add the decoration. This could likely have been done with a simple property and be done with it. However, when you get to 5 or 20 decorations, the complexity with a decorator is much less than with properties.
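A rough sketch of that shape (names invented for the example; the "simple property" alternative would just be a boolean flag on the one class):

    interface Message {
        String text();
    }

    class PlainMessage implements Message {
        public String text() { return "hello"; }
    }

    // Each decoration wraps another Message instead of adding another flag/property.
    class EncryptedMessage implements Message {
        private final Message inner;
        EncryptedMessage(Message inner) { this.inner = inner; }
        public String text() { return "<encrypted>" + inner.text() + "</encrypted>"; }
    }

    class CompressedMessage implements Message {
        private final Message inner;
        CompressedMessage(Message inner) { this.inner = inner; }
        public String text() { return "<compressed>" + inner.text() + "</compressed>"; }
    }

    // Decorations compose freely; new combinations require no new class:
    // Message m = new CompressedMessage(new EncryptedMessage(new PlainMessage()));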
As Grover said, the power of design patterns is dependent on the ability of programmers to recognize them when they see them. It's like reducing a mathematical problem to a simpler problem and then solving the simpler one. To someone who doesn't realize this, though, it seems like you've just created another problem.
I think it's always a good idea to document explicitly, using comments and/or descriptive names, when you're using a pattern to solve a problem. This might even educate another programmer who comes across it about the pattern if he wasn't aware of this.
I think it depends on the "audience" i.e. the maintaining developers of the code base. If they are design pattern illiterate then yes it can increase complexity, because most things one doesn't understand are "complex".
If the team is design pattern literate, i.e. they understand the basics and understand the premise behind why design patterns are useful (and as important when they're not) then I think they reduce complexity.
After all, Computer Science may be a fledgling science, but it has decades of experience under its belt. The chances are somebody has already solved your problem before, whether the answer is a design pattern, a data structure, or an algorithm.
I rather like this humorous explanation by Dylan Beattie. I recommend the read (if nothing else to waste five minutes on a Friday morning!)
Design patterns work like algorithms for object-oriented programming. They show you how to put objects together in a meaningful relationship that performs a particular task. So I would say yes, they reduce complexity by allowing you to understand the design of the software better, the same way algorithms do in procedural programs.
If I told you that X was using a linked list with a bubble sort, it would be a lot easier to follow along with what the programmer was doing.
Design patterns increase the amount of code and divide it into multiple parts. If the pattern and its concept are known, it doesn't seem complex; but when code is based on a design pattern you don't know, it looks complex.
I did some research on this topic in the scope of the GoF patterns. I observed OO metric value fluctuations after design pattern refactorings; you can check it out using this link.
Design patterns definitely add complexity upfront in return for more modular, maintainable, flexible and extensible code in the long run. Perhaps the Iterator pattern epitomizes it best.
Using an index to iterate over a list is clearly a very simple and intuitive thing to do, yet the pattern encapsulates it in an iterator, which is definitely more complex than a simple index.
The trade-off is that, it is fair to say, it has taken away the (implementation) simplicity, but in return it has made iteration more flexible by removing the iteration responsibility from the container and turning it into an object that can be reused.
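To make the trade-off concrete, here is a minimal comparison using standard java.util types (nothing here is project-specific):

    import java.util.Iterator;
    import java.util.List;

    class IterationExample {
        // Index-based: simple, but the caller is tied to random-access containers.
        static void printByIndex(List<String> items) {
            for (int i = 0; i < items.size(); i++) {
                System.out.println(items.get(i));
            }
        }

        // Iterator-based: more machinery, but works for any Iterable and keeps
        // the traversal logic out of the container's clients.
        static void printByIterator(Iterable<String> items) {
            Iterator<String> it = items.iterator();
            while (it.hasNext()) {
                System.out.println(it.next());
            }
        }
    }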
That's design patterns for you.
Design patterns don't increase complexity; they can make things a lot easier to read and maintain. It can be harder for newcomers to integrate into your developer team, but this effort will be beneficial as a whole.
The problem is not design patterns as a whole, but their abuse.
They should neither increase nor decrease the complexity.
You should always use an appropriate design for your code. This may use common design patterns or not.
The main benefits to design patterns are
By learning them you have added more design tools to your toolbox
By learning their names, if you use them and put a comment stating the pattern you're using, it helps readers understand your design intent more concisely
When I teach patterns at Hopkins, the two big things I stress are:
Patterns are all about Communication of Intent
Don't use any of the specific patterns as a Golden Hammer; lock them in your toolbox and only pull them out if it makes sense for your application.
Along the same lines as database normalization - is there an approach to object normalization? Not a design pattern, but the same mathematical-like approach to normalizing object creation. For example, first normal form: no repeating fields...
Here are some links about DB normalization:
http://en.wikipedia.org/wiki/Database_normalization
http://databases.about.com/od/specificproducts/a/normalization.htm
Would this make object creation and self-documentation better?
Here's a link to a book about class normalization (guess we're really talking about classes)
http://www.agiledata.org/essays/classNormalization.html
Normalization has a mathematical foundation in predicate logic, and a clear and specific goal that the same piece of information never be represented twice in a single model; the purpose of this goal is to eliminate the possibility of inconsistent information in a data model. It can be shown via mathematical proof that if a data model has certain specific properties (that it passes tests for 1st Normal Form (1NF), 2NF, 3NF, etc.) that it is free from redundant data representation, i.e. it is Normalized.
Object orientation has no such underlying mathematical basis, and indeed, no clear and specific goal. It is simply a design idea for introducing more abstraction. The DRY principle, Command-Query Separation, Liskov Substitution Principle, Open-Closed Principle, Tell-Don't-Ask, Dependency Inversion Principle, and other heuristics for improving quality of code (many of which apply to code in general, not just object oriented programs) are not absolute in nature; they are guidelines that programmers have found useful in improving understandability, maintainability, and testability of their code.
With a relational data model, you can say with absolute certainty whether it is "normalized" or not, because it must pass ALL the tests for normal form, and they are quite specific. With an object model, on the other hand, because the goal of "understandable, maintainable, testable, etc" is rather vague, you cannot say with any certainty whether you have met that goal. With many of the design heuristics, you cannot even say for sure whether you have followed them. Have you followed the DRY principle if you're applying patterns to your design? Surely repeated use of a pattern isn't DRY? Furthermore, some of these heuristics or principles aren't always even necessarily good advice all the time. I do try to follow Command-Query Separation, but such useful things as a Stack or a Queue violate that concept in order to give us a rather elegant and useful result.
I guess the Single Responsibility Principle is at least related to this. Or at least, violation of the SRP is similar to a lack of normalization in some ways.
(It's possible I'm talking rubbish. I'm pretty tired.)
Interesting.
You may also be interested in looking at the Law of Demeter.
Another thing you may be interested in is c2's FearOfAddingClasses, as, arguably, the same reasoning that leads programmers to denormalise databases also leads to god classes and other code smells. For both OO and DB normalisation, we want to decompose everything. For databases this means more tables, for OO, more classes.
Now, it is worth bearing in mind the object relational impedance mismatch, that is, probably not everything will translate cleanly.
Object-relational models, or 'persistence layers', usually have 1-to-1 mappings between object attributes and database fields. So, can we normalise? Say we have a department object with employee1, employee2, ... etc. attributes. Obviously that should be replaced with a list of employees, so we can say 1NF works.
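In code, that 1NF-style step is simply replacing the repeated fields with a collection (field and class names invented for the example):

    import java.util.ArrayList;
    import java.util.List;

    // Before: repeating fields, analogous to a table that violates 1NF.
    class DepartmentBefore {
        Employee employee1;
        Employee employee2;
        Employee employee3;   // ...and so on
    }

    // After: the repeating group becomes a list.
    class Department {
        final List<Employee> employees = new ArrayList<>();
    }

    class Employee {
        String name;
    }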
With that in mind, let's go straight for the kill and look at 6NF database design; a good example is Anchor Modeling (ignore the naming convention). Anchor Modeling/6NF provides highly decomposed and flexible database schemas; how does this translate to OO 'normalisation'?
Anchor Modeling has these kinds of relationships:
Anchors - unique object IDs.
Attributes, which translate to object attributes: (Anchor, value, metadata).
Ties - relationships between two or more objects (themselves anchors): (Anchor, Anchor... , metadata)
Knots, attributed Ties.
Attribute metadata can be anything - who changed an attribute, when, why, etc.
The OO translation of this looks extremely flexible:
Anchors suggest attribute-less placeholders, like a proxy which knows how to deal with the attribute composition.
Attributes suggest classes representing attributes and what they belong to. This suggests applying reuse to how attributes are looked up and dealt with, e.g automatic constraint checking, etc. From this we have a basis to generically implement the GOF-style Structural patterns.
Ties and Knots suggest classes representing relationships between objects. A basis for generic implementation of the Behavioural design patterns?
Interesting and desirable properties of Anchor Modeling that also translate across are:
All this requires replacing inheritance with composition (good) in the exposed objects.
Attributes have owners rather than owners having attributes. Although this makes attribute lookup more complex, it neatly solves certain aliasing problems, as there can only ever be one owner.
No need for NULL. This translates to clearer NULL handling. Empty-case attribute classes could provide methods for handling the lack of a particular attribute, instead of performing NULL-checking everywhere.
Attribute metadata. Attribute-level full historisation and blaming: 'play' objects back in time, see what changed, when and why, etc. (if required - metadata is entirely optional)
There would probably be a lot of very simple classes (which is good), and a very declarative programming style (also good).
Thanks for such a thought provoking question, I hope this is useful for you.
Perhaps you're taking this from a relational point-of-view, but I would posit that the principles of interfaces and inheritance correspond to normalization in the world of OOP.
For example, a Person abstract class containing FirstName, LastName, Gender and BirthDate can be used by classes such as Employee, User, Member etc. as a valid base class, without a need to repeat the definitions of those attributes in such subclasses.
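A small sketch of that idea (the attribute set is trimmed for brevity):

    import java.time.LocalDate;

    // Shared attributes are defined once in the base class...
    abstract class Person {
        String firstName;
        String lastName;
        LocalDate birthDate;
    }

    // ...so subclasses add only what is specific to them.
    class Employee extends Person {
        String employeeNumber;
    }

    class Member extends Person {
        LocalDate joinedOn;
    }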
The DRY principle (a core principle of Andy Hunt and Dave Thomas's book The Pragmatic Programmer) and the constant emphasis of object-oriented programming on re-use also correspond to the efficiencies offered by normalization in relational databases.
At first glance, I'd say that the objectives of Code Refactoring are similar in an abstract way to the objectives of normalization. But that's pretty abstract.
Update: I almost wrote earlier that "we need to get Jon Skeet in on this one." I posted my answer and who beat me? You guessed it...
Object Role Modeling (not to be confused with Object Relational Mapping) is the closest thing I know of to normalization for objects. It doesn't have as mathematical a foundation as normalization, but it's a start.
In a fairly ad-hoc and untutored fashion, that will probably cause purists to scoff, and perhaps rightly, I think of a database table as being a set of objects of a particular type, and vice versa. Then I take my thoughts from there. Viewed this way, it doesn't seem to me like there's anything particularly special you have to do to use normal form in your everyday programming. Each object's identity will do for starters as its primary key, and references (pointers, etc.) will do by way of foreign keys. Then just follow the same rules.
(My objects usually end up in 3NF, or some approximation thereof. I treat this all more as guidelines, and, like I said, "untutored".)
If the rules are followed properly, each bit of information then ends up in one place, the interrelationships are clear, and everything is structured such that introducing inconsistencies takes some work. One could say that this approach produces good results on this basis, and I would agree.
One downside is that the end result can feel a bit like a tangle of spaghetti, particularly after some time away, and it's hard to shake the constant lingering sensation, even though it's usually false, that surely a few of all these links could be removed...
Object oriented design is rational but it does not have the same mathematically well-defined basis as the Relational Model. There is nothing exactly equivalent to the well-defined normal forms of database design.
Whether this is a strength or a weakness of Object oriented design is a matter of interpretation.
I second the SRP. The Open-Closed Principle applies to "normalization" as well, although I might be stretching the meaning of the word, in that it should be possible to extend the system by adding new implementations without modifying the existing code. objectmentor about OCP
Good question; sorry I can't answer in depth.
I've been working on object normalization off and on for over 20 years. It's deep and complicated and beautiful, and is the subject of my second planned book, Object Mechanics II. ONF = Object Normal Form, you heard it here first! ;-)
since potentially patentable technology lurks within, I am not at liberty to say more, except that normalizing the data is the really easy part ;-)
ADDENDUM: change of plans - see https://softwareengineering.stackexchange.com/questions/84598/object-oriented-normalization