Why prefer composition over inheritance? What trade-offs are there for each approach? When should you choose inheritance over composition?
Prefer composition over inheritance as it is more malleable / easy to modify later, but do not use a compose-always approach. With composition, it's easy to change behavior on the fly with Dependency Injection / Setters. Inheritance is more rigid as most languages do not allow you to derive from more than one type. So the goose is more or less cooked once you derive from TypeA.
My acid test for the above is:
Does TypeB want to expose the complete interface (all public methods no less) of TypeA such that TypeB can be used where TypeA is expected? Indicates Inheritance.
e.g. A Cessna biplane will expose the complete interface of an airplane, if not more. So that makes it fit to derive from Airplane.
Does TypeB want only some/part of the behavior exposed by TypeA? Indicates need for Composition.
e.g. A Bird may need only the fly behavior of an Airplane. In this case, it makes sense to extract it out as an interface / class / both and make it a member of both classes.
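The acid test above can be sketched roughly as follows, in Python for brevity. All names here (FlyBehavior, WingFlight, Airplane, Cessna, Bird, take_off) are hypothetical and chosen only for illustration.

```python
from typing import Protocol


class FlyBehavior(Protocol):
    """Hypothetical interface for the one behavior both types need."""

    def fly(self) -> str: ...


class WingFlight:
    """One concrete fly behavior, shared through composition."""

    def fly(self) -> str:
        return "flying on wings"


class Airplane:
    """Exposes a full airplane interface; flying is delegated to a member."""

    def __init__(self, flight: FlyBehavior) -> None:
        self._flight = flight

    def fly(self) -> str:
        return self._flight.fly()

    def refuel(self) -> str:
        return "refueled"


class Cessna(Airplane):
    """Wants the complete Airplane interface, so it inherits."""

    def fly(self) -> str:
        return "Cessna " + super().fly()


class Bird:
    """Wants only the fly behavior, so it holds one instead of inheriting."""

    def __init__(self, flight: FlyBehavior) -> None:
        self._flight = flight

    def fly(self) -> str:
        return self._flight.fly()


def take_off(plane: Airplane) -> str:
    # Works for any Airplane, including subclasses such as Cessna.
    return plane.refuel() + ", " + plane.fly()
```

A Cessna can be passed to take_off() because it is usable wherever an Airplane is expected, while a Bird gets the flying behavior without picking up refuel().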
Update: Just came back to my answer and it seems now that it is incomplete without a specific mention of Barbara Liskov's Liskov Substitution Principle as a test for 'Should I be inheriting from this type?'
Think of containment as a has a relationship. A car "has an" engine, a person "has a" name, etc.
Think of inheritance as an is a relationship. A car "is a" vehicle, a person "is a" mammal, etc.
I take no credit for this approach. I took it straight from the Second Edition of Code Complete by Steve McConnell, Section 6.3.
If you understand the difference, it's easier to explain.
Procedural Code
An example of this is PHP without the use of classes (particularly before PHP5). All logic is encoded in a set of functions. You may include other files containing helper functions and so on and conduct your business logic by passing data around in functions. This can be very hard to manage as the application grows. PHP5 tries to remedy this by offering a more object-oriented design.
Inheritance
This encourages the use of classes. Inheritance is one of the three tenets of OO design (inheritance, polymorphism, encapsulation).
class Person {
    public string Title;
    public string Name;
    public int Age;
}
class Employee : Person {
    public int Salary;
    public new string Title;  // hides Person.Title
}
This is inheritance at work. The Employee "is a" Person or inherits from Person. All inheritance relationships are "is-a" relationships. Employee also shadows the Title property from Person, meaning Employee.Title will return the Title for the Employee and not the Person.
Composition
Composition is favoured over inheritance. To put it very simply you would have:
class Person {
    public string Title;
    public string Name;
    public int Age;
    public Person(string title, string name, int age) {
        this.Title = title;
        this.Name = name;
        this.Age = age;
    }
}
class Employee {
    public int Salary;
    private Person person;  // Employee "has a" Person
    public Employee(Person p, int salary) {
        this.person = p;
        this.Salary = salary;
    }
}
Person johnny = new Person("Mr.", "John", 25);
Employee john = new Employee(johnny, 50000);
Composition is typically "has a" or "uses a" relationship. Here the Employee class has a Person. It does not inherit from Person but instead gets the Person object passed to it, which is why it "has a" Person.
Composition over Inheritance
Now say you want to create a Manager type so you end up with:
class Manager : Person, Employee {
...
}
In a language that permits multiple inheritance this will compile, but what if Person and Employee both declare Title? Should Manager.Title return "Manager of Operations" or "Mr."? Under composition this ambiguity is better handled:
class Manager {
    public string Title;
    public Manager(Person p, Employee e)
    {
        // Explicitly choose which Title to use.
        this.Title = e.Title;
    }
}
The Manager object is composed of an Employee and a Person. The Title behaviour is taken from Employee. This explicit composition removes ambiguity among other things and you'll encounter fewer bugs.
With all the undeniable benefits provided by inheritance, here are some of its disadvantages.
Disadvantages of Inheritance:
You can't change the implementation inherited from super classes at runtime (obviously because inheritance is defined at compile time).
Inheritance exposes a subclass to details of its parent class implementation, that's why it's often said that inheritance breaks encapsulation (in a sense that you really need to focus on interfaces only not implementation, so reusing by sub classing is not always preferred).
The tight coupling provided by inheritance makes the implementation of a subclass so bound up with the implementation of its superclass that any change in the parent implementation will force the subclass to change.
Excessive reusing by sub-classing can make the inheritance stack very deep and very confusing too.
On the other hand Object composition is defined at runtime through objects acquiring references to other objects. In such a case these objects will never be able to reach each-other's protected data (no encapsulation break) and will be forced to respect each other's interface. And in this case also, implementation dependencies will be a lot less than in case of inheritance.
Another very pragmatic reason to prefer composition over inheritance has to do with your domain model and mapping it to a relational database. It's really hard to map inheritance to the SQL model (you end up with all sorts of hacky workarounds, like creating columns that aren't always used, using views, etc). Some ORMs try to deal with this, but it always gets complicated quickly. Composition can be easily modeled through a foreign-key relationship between two tables, but inheritance is much harder.
While in short I would agree with "Prefer composition over inheritance", very often for me it sounds like "prefer potatoes over coca-cola". There are places for inheritance and places for composition. You need to understand the difference, and then this question will disappear. What it really means for me is "if you are going to use inheritance - think again, chances are you need composition".
You should prefer potatoes over coca cola when you want to eat, and coca cola over potatoes when you want to drink.
Creating a subclass should mean more than just a convenient way to call superclass methods. You should use inheritance when the subclass "is-a" superclass both structurally and functionally, when it can be used as the superclass and you are going to use it that way. If that is not the case, it is not inheritance but something else. Composition is when your objects consist of other objects, or have some relationship to them.
So for me it looks like if someone does not know whether they need inheritance or composition, the real problem is that they do not know whether they want to drink or to eat. Think about your problem domain more, understand it better.
Didn't find a satisfactory answer here, so I wrote a new one.
To understand why we "prefer composition over inheritance", we first need to recover the assumption omitted in this shortened idiom.
There are two benefits of inheritance: subtyping and subclassing
Subtyping means conforming to a type (interface) signature, i.e. a set of APIs, and one can override part of the signature to achieve subtyping polymorphism.
Subclassing means implicit reuse of method implementations.
With the two benefits come two different purposes for doing inheritance: subtyping oriented and code reuse oriented.
If code reuse is the sole purpose, subclassing may give you more than you need, i.e. some public methods of the parent class don't make much sense for the child class. In this case, instead of favoring composition over inheritance, composition is demanded. This is also where the "is-a" vs. "has-a" notion comes from.
So only when subtyping is the purpose, i.e. to use the new class later in a polymorphic manner, do we face the problem of choosing inheritance or composition. This is the assumption that gets omitted in the shortened idiom under discussion.
To subtype is to conform to a type signature, which means the composing class always has to expose at least the full API of the type. Now the trade-offs kick in:
Inheritance provides straightforward code reuse if not overridden, while composition has to re-code every API, even if it's just a simple job of delegation.
Inheritance provides straightforward open recursion via the internal polymorphic site this, i.e. invoking an overriding method (or even type) in another member function, either public or private (though discouraged). Open recursion can be simulated via composition, but it requires extra effort and may not always be viable. This answer to a duplicated question discusses something similar.
Inheritance exposes protected members. This breaks encapsulation of the parent class, and if used by subclass, another dependency between the child and its parent is introduced.
Composition has the benefit of inversion of control, and its dependency can be injected dynamically, as is shown in the decorator pattern and the proxy pattern.
Composition has the benefit of combinator-oriented programming, i.e. working in a way like the composite pattern.
Composition immediately follows programming to an interface.
Composition has the benefit of easy multiple inheritance.
With the above trade-offs in mind, we hence prefer composition over inheritance. Yet for tightly related classes, i.e. when implicit code reuse really brings benefits, or the magic power of open recursion is desired, inheritance shall be the choice.
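The first two trade-offs in the list (free reuse and open recursion versus hand-written delegation) can be illustrated with a small Python sketch. The class names Counter, LoggingCounter, and LoggingWrapper are made up for this example.

```python
class Counter:
    """Base class whose add_all() calls the overridable add() via self."""

    def __init__(self) -> None:
        self.items = []

    def add(self, item: int) -> None:
        self.items.append(item)

    def add_all(self, items: list) -> None:
        for item in items:
            self.add(item)  # open recursion: dispatches to a subclass override


class LoggingCounter(Counter):
    """Inheritance: add_all() is reused as-is and sees the overridden add()."""

    def __init__(self) -> None:
        super().__init__()
        self.log = []

    def add(self, item: int) -> None:
        self.log.append(f"add {item}")
        super().add(item)


class LoggingWrapper:
    """Composition: every method of the wrapped type is forwarded by hand."""

    def __init__(self, inner: Counter) -> None:
        self._inner = inner
        self.log = []

    def add(self, item: int) -> None:
        self.log.append(f"add {item}")
        self._inner.add(item)

    def add_all(self, items: list) -> None:
        # The inner object's self.add() never reaches this wrapper,
        # so nothing is logged here unless this method logs explicitly.
        self._inner.add_all(items)
```

With inheritance, LoggingCounter().add_all([1, 2]) logs both additions without any extra code. The wrapper has to forward each method itself, and its add_all() logs nothing because the inner Counter calls its own add().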
Inheritance is pretty enticing, especially coming from procedural-land, and it often looks deceptively elegant. I mean all I need to do is add this one bit of functionality to some other class, right? Well, one of the problems is that inheritance is probably the worst form of coupling you can have.
Your base class breaks encapsulation by exposing implementation details to subclasses in the form of protected members. This makes your system rigid and fragile. The more tragic flaw however is the new subclass brings with it all the baggage and opinion of the inheritance chain.
The article, Inheritance is Evil: The Epic Fail of the DataAnnotationsModelBinder, walks through an example of this in C#. It shows the use of inheritance when composition should have been used and how it could be refactored.
When can you use composition?
You can always use composition. In some cases, inheritance is also possible and may lead to a more powerful and/or intuitive API, but composition is always an option.
When can you use inheritance?
It is often said that if "a bar is a foo", then the class Bar can inherit the class Foo. Unfortunately, this test alone is not reliable. Use the following instead:
a bar is a foo, AND
bars can do everything that foos can do.
The first test ensures that all getters of Foo make sense in Bar (= shared properties), while the second test makes sure that all setters of Foo make sense in Bar (= shared functionality).
Example: Dog/Animal
A dog is an animal AND dogs can do everything that animals can do (such as breathing, moving, etc.). Therefore, the class Dog can inherit the class Animal.
Counter-example: Circle/Ellipse
A circle is an ellipse BUT circles can't do everything that ellipses can do. For example, circles can't stretch, while ellipses can. Therefore, the class Circle cannot inherit the class Ellipse.
This is called the Circle-Ellipse problem, which isn't really a problem, but more an indication that "a bar is a foo" isn't a reliable test by itself. In particular, this example highlights that derived classes should extend the functionality of base classes, never restrict it. Otherwise, the base class couldn't be used polymorphically. Adding the test "bars can do everything that foos can do" ensures that polymorphic use is possible, and is equivalent to the Liskov Substitution Principle:
Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it
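A rough Python sketch of the Circle/Ellipse counter-example follows. The method name stretch_width and the helper double_width are assumptions made up for the illustration.

```python
class Ellipse:
    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def stretch_width(self, factor: float) -> None:
        """Ellipses can be stretched along one axis only."""
        self.width *= factor


class Circle(Ellipse):
    """'A circle is an ellipse', but it cannot honor stretch_width()."""

    def __init__(self, diameter: float) -> None:
        super().__init__(diameter, diameter)

    def stretch_width(self, factor: float) -> None:
        # Restricting inherited behavior is what breaks substitutability.
        raise NotImplementedError("a circle cannot be stretched along one axis")


def double_width(shape: Ellipse) -> float:
    """Written against Ellipse, without knowing about any derived class."""
    shape.stretch_width(2)
    return shape.width
```

double_width() works for a plain Ellipse but fails when handed a Circle, which is exactly the situation the second test ("bars can do everything that foos can do") is meant to rule out.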
When should you use inheritance?
Even if you can use inheritance, that doesn't mean you should: using composition is always an option. Inheritance is a powerful tool allowing implicit code reuse and dynamic dispatch, but it does come with a few disadvantages, which is why composition is often preferred. The trade-offs between inheritance and composition aren't obvious, and in my opinion are best explained in lcn's answer.
As a rule of thumb, I tend to choose inheritance over composition when polymorphic use is expected to be very common, in which case the power of dynamic dispatch can lead to a much more readable and elegant API. For example, having a polymorphic class Widget in GUI frameworks, or a polymorphic class Node in XML libraries, allows an API that is much more readable and intuitive to use than what you would have with a solution purely based on composition.
In Java or C#, an object cannot change its type once it has been instantiated.
So, if your object needs to appear as a different object or behave differently depending on its state or conditions, then use composition: refer to the State and Strategy design patterns.
If the object needs to be of the same type, then use inheritance or implement interfaces.
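As a minimal sketch of the Strategy idea, here in Python: the object keeps its type for its whole lifetime, while the behavior it is composed with is swapped at runtime. ShippingPolicy, StandardShipping, ExpressShipping, and Order are hypothetical names, and the prices are arbitrary.

```python
from typing import Protocol


class ShippingPolicy(Protocol):
    """Hypothetical strategy interface."""

    def cost(self, weight_kg: float) -> float: ...


class StandardShipping:
    def cost(self, weight_kg: float) -> float:
        return 5.0 + 1.0 * weight_kg


class ExpressShipping:
    def cost(self, weight_kg: float) -> float:
        return 15.0 + 2.0 * weight_kg


class Order:
    """The Order never changes type; only the composed policy changes."""

    def __init__(self, weight_kg: float, policy: ShippingPolicy) -> None:
        self.weight_kg = weight_kg
        self.policy = policy

    def shipping_cost(self) -> float:
        return self.policy.cost(self.weight_kg)
```

Assigning a different policy to an existing Order changes how it behaves without creating a new object or a new subclass.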
Personally I learned to always prefer composition over inheritance. There is no programmatic problem you can solve with inheritance which you cannot solve with composition; though you may have to use Interfaces(Java) or Protocols(Obj-C) in some cases. Since C++ doesn't know any such thing, you'll have to use abstract base classes, which means you cannot get entirely rid of inheritance in C++.
Composition is often more logical, it provides better abstraction, better encapsulation, better code reuse (especially in very large projects) and is less likely to break anything at a distance just because you made an isolated change anywhere in your code. It also makes it easier to uphold the "Single Responsibility Principle", which is often summarized as "There should never be more than one reason for a class to change.", and it means that every class exists for a specific purpose and it should only have methods that are directly related to its purpose. Also having a very shallow inheritance tree makes it much easier to keep the overview even when your project starts to get really large. Many people think that inheritance represents our real world pretty well, but that isn't the truth. The real world uses much more composition than inheritance. Pretty much every real world object you can hold in your hand has been composed out of other, smaller real world objects.
There are downsides to composition, though. If you skip inheritance altogether and focus only on composition, you will notice that you often have to write a couple of extra lines of code that wouldn't have been necessary had you used inheritance. You are also sometimes forced to repeat yourself, which violates the DRY Principle (DRY = Don't Repeat Yourself). Composition also often requires delegation, where a method just calls a method of another object with no other code surrounding the call. Such "double method calls" (which may easily extend to triple or quadruple method calls and even further than that) can perform worse than inheritance, where you simply inherit a method of your parent. Calling an inherited method may be equally fast as calling a non-inherited one, or it may be slightly slower, but it is usually still faster than two consecutive method calls.
You may have noticed that most OO languages don't allow multiple inheritance. There are a couple of cases where multiple inheritance can really buy you something, but those are exceptions rather than the rule. Whenever you run into a situation where you think "multiple inheritance would be a really cool feature to solve this problem", you are usually at a point where you should rethink inheritance altogether, since even though it may require a couple of extra lines of code, a solution based on composition will usually turn out to be much more elegant, flexible, and future-proof.
Inheritance is a really cool feature, but I'm afraid it has been overused over the last couple of years. People treated inheritance as the one hammer that can nail it all, regardless of whether it was actually a nail, a screw, or maybe something completely different.
My general rule of thumb: Before using inheritance, consider if composition makes more sense.
Reason: Subclassing usually means more complexity and connectedness, i.e. harder to change, maintain, and scale without making mistakes.
A much more complete and concrete answer from Tim Boudreau of Sun:
Common problems to the use of inheritance as I see it are:
Innocent acts can have unexpected results - The classic example of this is calls to overridable methods from the superclass constructor, before the subclasses instance fields have been initialized. In a perfect world, nobody would ever do that. This is not a perfect world.
It offers perverse temptations for subclassers to make assumptions about order of method calls and such - such assumptions tend not to be stable if the superclass may evolve over time. See also my toaster and coffee pot analogy.
Classes get heavier - you don't necessarily know what work your superclass is doing in its constructor, or how much memory it's going to use. So constructing some innocent would-be lightweight object can be far more expensive than you think, and this may change over time if the superclass evolves
It encourages an explosion of subclasses. Classloading costs time, more classes costs memory. This may be a non-issue until you're dealing with an app on the scale of NetBeans, but there, we had real issues with, for example, menus being slow because the first display of a menu triggered massive class loading. We fixed this by moving to more declarative syntax and other techniques, but that cost time to fix as well.
It makes it harder to change things later - if you've made a class public, swapping the superclass is going to break subclasses - it's a choice which, once you've made the code public, you're married to. So if you're not altering the real functionality to your superclass, you get much more freedom to change things later if you use, rather than extend the thing you need. Take, for example, subclassing JPanel - this is usually wrong; and if the subclass is public somewhere, you never get a chance to revisit that decision. If it's accessed as JComponent getThePanel(), you can still do it (hint: expose models for the components within as your API).
Object hierarchies don't scale (or making them scale later is much harder than planning ahead) - this is the classic "too many layers" problem. I'll go into this below, and how the AskTheOracle pattern can solve it (though it may offend OOP purists).
...
My take on what to do, if you do allow for inheritance, which you may take with a grain of salt is:
Expose no fields, ever, except constants
Methods shall be either abstract or final
Call no methods from the superclass constructor
...
all of this applies less to small projects than large ones, and less to private classes than public ones
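The first problem in the quoted list, calling an overridable method from a superclass constructor, can be reproduced with a small Python analogue. Widget and NamedWidget are hypothetical names. In Java the subclass field would typically still hold its default value at that point, whereas in Python the attribute does not exist yet and the call fails outright.

```python
class Widget:
    def __init__(self) -> None:
        # Calling an overridable method from the base constructor.
        self.label = self.describe()

    def describe(self) -> str:
        return "widget"


class NamedWidget(Widget):
    def __init__(self, name: str) -> None:
        super().__init__()  # runs describe() before self.name exists
        self.name = name

    def describe(self) -> str:
        return "widget " + self.name
```

Constructing Widget() works, but NamedWidget("ok") raises AttributeError because the overridden describe() runs before the subclass has set self.name.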
Inheritance is very powerful, but you can't force it (see: the circle-ellipse problem). If you really can't be completely sure of a true "is-a" subtype relationship, then it's best to go with composition.
Inheritance creates a strong relationship between a subclass and its superclass; the subclass must be aware of the superclass's implementation details. Creating the superclass is much harder when you have to think about how it can be extended. You have to document class invariants carefully, and state which other methods the overridable methods use internally.
Inheritance is sometimes useful, if the hierarchy really represents an is-a relationship. It relates to the Open-Closed Principle, which states that classes should be closed for modification but open to extension. That way you can have polymorphism: a generic method deals with the supertype and its methods, but via dynamic dispatch the method of the subclass is invoked. This is flexible, and helps to create indirection, which is essential in software (to know less about implementation details).
Inheritance is easily overused, though, and creates additional complexity, with hard dependencies between classes. Also understanding what happens during execution of a program gets pretty hard due to layers and dynamic selection of method calls.
I would suggest using composition as the default. It is more modular, and gives the benefit of late binding (you can change the component dynamically). Also it's easier to test the things separately. And if you need to use a method from a class, you are not forced to be of a certain form (Liskov Substitution Principle).
Suppose an aircraft has only two parts: an engine and wings.
Then there are two ways to design an aircraft class.
class Aircraft extends Engine {
    var wings;
}
Now your aircraft can start with fixed wings and change them to rotary wings on the fly. It's essentially an engine with wings. But what if I wanted to change the engine on the fly as well?
Either the base class Engine exposes a mutator to change its properties, or I redesign Aircraft as:
class Aircraft {
    var wings;
    var engine;
}
Now, I can replace my engine on the fly as well.
If you want the canonical, textbook answer people have been giving since the rise of OOP (which you see many people giving in these answers), then apply the following rule: "if you have an is-a relationship, use inheritance. If you have a has-a relationship, use composition".
This is the traditional advice, and if that satisfies you, you can stop reading here and go on your merry way. For everyone else...
is-a/has-a comparisons have problems
For example:
A square is-a rectangle, but if your rectangle class has setWidth()/setHeight() methods, then there's no reasonable way to make a Square inherit from Rectangle without breaking Liskov's substitution principle.
An is-a relationship can often be rephrased to sound like a has-a relationship. For example, an employee is-a person, but a person also has-an employment status of "employed".
is-a relationships can lead to nasty multiple inheritance hierarchies if you're not careful. After all, there's no rule in English that states that an object is exactly one thing.
People are quick to pass this "rule" around, but has anyone ever tried to back it up, or explain why it's a good heuristic to follow? Sure, it fits nicely into the idea that OOP is supposed to model the real world, but that's not in-and-of-itself a reason to adopt a principle.
See this StackOverflow question for more reading on this subject.
To know when to use inheritance vs composition, we first need to understand the pros and cons of each.
The problems with implementation inheritance
Other answers have done a wonderful job at explaining the issues with inheritance, so I'll try to not delve into too many details here. But, here's a brief list:
It can be difficult to follow a logic that weaves between base and sub-class methods.
Carelessly implementing one method in your class by calling another overridable method will cause you to leak implementation details and break encapsulation, as the end-user could override your method and detect when you internally call it. (See "Effective Java" item 18).
The fragile base class problem, which simply states that your end-user's code will break if they happen to depend on the leakage of implementation details when you attempt to change them. To make matters worse, most OOP languages allow inheritance by default - API designers who aren't proactively preventing people from inheriting from their public classes need to be extra cautious whenever they refactor their base classes. Unfortunately, the fragile base class problem is often misunderstood, causing many to not understand what it takes to maintain a class that anyone can inherit from.
The deadly diamond of death
The problems with composition
It can sometimes be a little verbose.
That's it. I'm serious. This is still a real issue and can sometimes create conflict with the DRY principle, but it's generally not that bad, at least compared to the myriad of pitfalls associated with inheritance.
When should inheritance be used?
Next time you're drawing out your fancy UML diagrams for a project (if you do that), and you're thinking about adding in some inheritance, please adhere to the following advice: don't.
At least, not yet.
Inheritance is sold as a tool to achieve polymorphism, but bundled with it is this powerful code-reuse system, that frankly, most code doesn't need. The problem is, as soon as you publicly expose your inheritance hierarchy, you're locked into this particular style of code-reuse, even if it's overkill to solve your particular problem.
To avoid this, my two cents would be to never expose your base classes publicly.
If you need polymorphism, use an interface.
If you need to allow people to customize the behavior of your class, provide explicit hook-in points via the strategy pattern. It's a more readable way to accomplish this, and it's easier to keep this sort of API stable, as you're in full control over what behaviors they can and cannot change.
If you're trying to follow the open-closed principle by using inheritance to avoid adding a much-needed update to a class, just don't. Update the class. Your codebase will be much cleaner if you actually take ownership of the code you're hired to maintain instead of trying to tack stuff onto the side of it. If you're scared about introducing bugs, then get the existing code under test.
If you need to reuse code, start out by trying to use composition or helper functions.
Finally, if you've decided that there's no other good option, and you must use inheritance to achieve the code-reuse that you need, then you can use it, but, follow these four P.A.I.L. rules of restricted inheritance to keep it sane.
Use inheritance as a private implementation detail. Don't expose your base class publicly, use interfaces for that. This lets you freely add or remove inheritance as you see fit without making a breaking change.
Keep your base class abstract. It makes it easier to divide out the logic that needs to be shared from the logic that doesn't.
Isolate your base and child classes. Don't let your subclass override base class methods (use the strategy pattern for that), and avoid having them expect properties/methods to exist on each other, use other forms of code-sharing to achieve that. Use appropriate language features to force all methods on the base class to be non-overridable ("final" in Java, or non-virtual in C#).
Inheritance is a last resort.
The Isolate rule in particular may sound a little rough to follow, but if you discipline yourself, you'll get some pretty nice benefits. In particular, it gives you the freedom to avoid all of the main nasty pitfalls associated with the inheritance that were mentioned above.
It's much easier to follow the code because it doesn't weave in and out of base/sub classes.
You can not accidentally leak when your methods are internally calling other overridable methods if you never make any of your methods overridable. In other words, you won't accidentally break encapsulation.
The fragile base class problem stems from the ability to depend on accidentally leaked implementation details. Since the base class is now isolated, it will be no more fragile than a class depending on another via composition.
The deadly diamond of death isn't an issue anymore, since there's simply no need to have multiple layers of inheritance. If you have the abstract base classes B and C, which both share a lot of functionality, just move that functionality out of B and C and into a new abstract base class, class D. Anyone who inherited from B should update to inherit from both B and D, and anyone who inherited from C should inherit from C and D. Since your base classes are all private implementation details, it shouldn't be too difficult to figure out who's inheriting from what, to make these changes.
Conclusion
My primary suggestion would be to use your brain on this matter. What's far more important than a list of dos and don'ts about when to use inheritance is an intuitive understanding of inheritance and its associated pros and cons, along with a good understanding of the other tools out there that can be used instead of inheritance (composition isn't the only alternative. For example, the strategy pattern is an amazing tool that's forgotten far too often). Perhaps when you have a good, solid understanding of all of these tools, you'll choose to use inheritance more often than I would recommend, and that's completely fine. At least, you're making an informed decision, and aren't just using inheritance because that's the only way you know how to do it.
Further reading:
An article I wrote on this subject, that dives even deeper and provides examples.
A webpage talking about three different jobs that inheritance does, and how those jobs can be done via other means in the Go language.
A list of reasons why it can be good to declare your class as non-inheritable (e.g. "final" in Java).
The "Effective Java" book by Joshua Bloch, item 18, which discusses composition over inheritance, and some of the dangers of inheritance.
You need to have a look at The Liskov Substitution Principle in Uncle Bob's SOLID principles of class design. :)
To address this question from a different perspective for newer programmers:
Inheritance is often taught early when we learn object-oriented programming, so it's seen as an easy solution to a common problem.
I have three classes that all need some common functionality. So if I write a base class and have them all inherit from it, then they will all have that functionality and I'll only need to maintain it in one place.
It sounds great, but in practice it almost never, ever works, for one of several reasons:
We discover that there are some other functions that we want our classes to have. If the way that we add functionality to classes is through inheritance, we have to decide - do we add it to the existing base class, even though not every class that inherits from it needs that functionality? Do we create another base class? But what about classes that already inherit from the other base class?
We discover that for just one of the classes that inherits from our base class we want the base class to behave a little differently. So now we go back and tinker with our base class, maybe adding some virtual methods, or even worse, some code that says, "If I'm inherited type A, do this, but if I'm inherited type B, do that." That's bad for lots of reasons. One is that every time we change the base class, we're effectively changing every inherited class. So we're really changing class A, B, C, and D because we need a slightly different behavior in class A. As careful as we think we are, we might break one of those classes for reasons that have nothing to do with those classes.
We might know why we decided to make all of these classes inherit from each other, but it might not (probably won't) make sense to someone else who has to maintain our code. We might force them into a difficult choice - do I do something really ugly and messy to make the change I need (see the previous bullet point) or do I just rewrite a bunch of this?
In the end, we tie our code in some difficult knots and get no benefit whatsoever from it except that we get to say, "Cool, I learned about inheritance and now I used it." That's not meant to be condescending because we've all done it. But we all did it because no one told us not to.
As soon as someone explained "favor composition over inheritance" to me, I thought back over every time I tried to share functionality between classes using inheritance and realized that most of the time it didn't really work well.
The antidote is the Single Responsibility Principle. Think of it as a constraint. My class must do one thing. I must be able to give my class a name that somehow describes that one thing it does. (There are exceptions to everything, but absolute rules are sometimes better when we're learning.) It follows that I cannot write a base class called ObjectBaseThatContainsVariousFunctionsNeededByDifferentClasses. Whatever distinct functionality I need must be in its own class, and then other classes that need that functionality can depend on that class, not inherit from it.
At the risk of oversimplifying, that's composition - composing multiple classes to work together. And once we form that habit we find that it's much more flexible, maintainable, and testable than using inheritance.
When you want to "copy"/Expose the base class' API, you use inheritance. When you only want to "copy" functionality, use delegation.
One example of this: You want to create a Stack out of a List. Stack only has pop, push and peek. You shouldn't use inheritance given that you don't want push_back, push_front, removeAt, et al.-kind of functionality in a Stack.
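A minimal sketch of that Stack, assuming a Python list as the underlying List. The class and method names are my own, not from any library; the point is only that the list is held as a member, so none of its other operations leak into the Stack's API.

```python
class Stack:
    """A stack that delegates storage to a list instead of inheriting from it."""

    def __init__(self):
        self._items = []  # the wrapped list is a private detail

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def peek(self):
        return self._items[-1]

    def __len__(self):
        return len(self._items)


stack = Stack()
stack.push(1)
stack.push(2)
top = stack.pop()                      # removes and returns the most recent item
has_insert = hasattr(stack, "insert")  # list-only operations are not exposed
```

Had Stack inherited from list, insert, sort and friends would all be callable on it and could break the last-in-first-out ordering.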
These two ways can live together just fine and actually support each other.
Composition is just playing it modular: you create an interface similar to the parent class, create a new object, and delegate calls to it. If these objects do not need to know about each other, composition is quite safe and easy to use. There are so many possibilities here.
However, if the parent class for some reason needs to access functions provided by the "child class", to an inexperienced programmer it may look like a great place to use inheritance. The parent class can just call its own abstract "foo()", which is overridden by the subclass, and the subclass can thereby give the value to the abstract base.
It looks like a nice idea, but in many cases it's better to just give the class an object which implements foo() (or even set the value provided by foo() manually) than to inherit the new class from some base class which requires the function foo() to be specified.
Why?
Because inheritance is a poor way of moving information.
Composition has a real edge here: the relationship can be reversed. The "parent class" or "abstract worker" can aggregate any specific "child" objects implementing a certain interface, and any child can be set inside any other type of parent that accepts its type. And there can be any number of objects: for example, MergeSort or QuickSort could sort any list of objects implementing an abstract Compare interface. Or to put it another way: any group of objects which implement "foo()" and any other group of objects which can make use of objects having "foo()" can play together.
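To make the sorting remark concrete, here is a rough sketch. It assumes a comparator object with a compare(a, b) method, and all the names (ByLength, Reversed, insertion_sort) are hypothetical. I used insertion sort only to keep it short; the same shape works for MergeSort or QuickSort.

```python
class ByLength:
    """Hypothetical comparator: orders strings by length."""

    def compare(self, a, b):
        return len(a) - len(b)


class Reversed:
    """Wraps any comparator and flips its ordering."""

    def __init__(self, inner):
        self._inner = inner

    def compare(self, a, b):
        return self._inner.compare(b, a)


def insertion_sort(items, comparator):
    """Sorts a copy of items using whatever comparator object is passed in."""
    result = list(items)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements right until the slot for current is found.
        while j >= 0 and comparator.compare(result[j], current) > 0:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


words = ["pear", "fig", "banana"]
shortest_first = insertion_sort(words, ByLength())
longest_first = insertion_sort(words, Reversed(ByLength()))
```

The sorter never inherits from anything and neither do the comparators; any object with a compare method can be plugged into any sorter that expects one.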
I can think of three real reasons for using inheritance:
You have many classes with same interface and you want to save time writing them
You have to use same Base Class for each object
You need to modify the private variables, which can not be public in any case
If these are true, then it is probably necessary to use inheritance.
There is nothing bad in using reason 1, it is very good thing to have a solid interface on your objects. This can be done using composition or with inheritance, no problem - if this interface is simple and does not change. Usually inheritance is quite effective here.
If the reason is number 2 it gets a bit tricky. Do you really only need to use the same base class? In general, just using the same base class is not good enough, but it may be a requirement of your framework, a design consideration which can not be avoided.
However, if you want to use the private variables (case 3), then you may be in trouble. If you consider global variables unsafe, then you should also consider using inheritance to get access to private variables unsafe. Mind you, global variables are not all THAT bad: databases are essentially a big set of global variables. But if you can handle it, then it's quite fine.
Aside from is a/has a considerations, one must also consider the "depth" of inheritance your object has to go through. Anything beyond five or six levels of inheritance deep might cause unexpected casting and boxing/unboxing problems, and in those cases it might be wise to compose your object instead.
When you have an is-a relation between two classes (example dog is a canine), you go for inheritance.
On the other hand, when you have a has-a or some adjective relationship between two classes (student has courses) or (teacher studies courses), you choose composition.
A simple way to make sense of this would be that inheritance should be used when you need an object of your class to have the same interface as its parent class, so that it can thereby be treated as an object of the parent class (upcasting). Moreover, function calls on a derived class object would remain the same everywhere in code, but the specific method to call would be determined at runtime (i.e. the low-level implementation differs, the high-level interface remains the same).
Composition should be used when you do not need the new class to have the same interface, i.e. you wish to conceal certain aspects of the class' implementation which the user of that class need not know about. So composition is more in the way of supporting encapsulation (i.e. concealing the implementation) while inheritance is meant to support abstraction (i.e. providing a simplified representation of something, in this case the same interface for a range of types with different internals).
Subtyping is appropriate and more powerful where the invariants can be enumerated, else use function composition for extensibility.
I agree with @Pavel when he says there are places for composition and there are places for inheritance.
I think inheritance should be used if your answer is an affirmative to any of these questions.
Is your class part of a structure that benefits from polymorphism? For example, if you had a Shape class which declares a method called draw(), then we clearly need the Circle and Square classes to be subclasses of Shape, so that their client classes would depend on Shape and not on specific subclasses.
Does your class need to re-use any high-level interactions defined in another class? The template method design pattern would be impossible to implement without inheritance. I believe all extensible frameworks use this pattern.
However, if your intention is purely that of code re-use, then composition most likely is a better design choice.
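For the template method case mentioned above, a small sketch may help. Report and SalesReport are made-up names for illustration. The base class owns the high-level sequence and the subclass only fills in one step.

```python
from abc import ABC, abstractmethod


class Report(ABC):
    """Hypothetical base class: fixes the order of steps, leaves one step open."""

    def render(self):
        # The template method: the skeleton lives here and is not overridden.
        return "\n".join([self.header(), self.body(), "-- end --"])

    def header(self):
        return "REPORT"  # default step, may be overridden

    @abstractmethod
    def body(self):
        """Subclasses must supply this step."""


class SalesReport(Report):
    def body(self):
        return "sales: 42"


text = SalesReport().render()
```

Here the subclass reuses the interaction defined in render(), which is the kind of reuse where inheritance earns its keep.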
Inheritance is a very powerful mechanism for code reuse, but it needs to be used properly. I would say that inheritance is used correctly if the subclass is also a subtype of the parent class. As mentioned above, the Liskov Substitution Principle is the key point here.
Subclass is not the same as subtype. You might create subclasses that are not subtypes (and this is when you should use composition). To understand what a subtype is, let's start by explaining what a type is.
When we say that the number 5 is of type integer, we are stating that 5 belongs to a set of possible values (as an example, see the possible values for the Java primitive types). We are also stating that there is a valid set of methods I can perform on the value like addition and subtraction. And finally we are stating that there are a set of properties that are always satisfied, for example, if I add the values 3 and 5, I will get 8 as a result.
To give another example, think about the abstract data types Set of integers and List of integers: the values they can hold are restricted to integers. They both support a set of methods, like add(newValue) and size(). And they have different properties (class invariants): a Set does not allow duplicates while a List does allow duplicates (of course there are other properties that they both satisfy).
A subtype is also a type, which has a relation to another type, called the parent type (or supertype). The subtype must satisfy the features (values, methods and properties) of the parent type. The relation means that in any context where the supertype is expected, it can be substituted by a subtype without affecting the behaviour of the execution. Let's look at some code to exemplify what I'm saying. Suppose I write a List of integers (in some sort of pseudo language):
class List {
    data = new Array();
    Integer size() {
        return data.length;
    }
    add(Integer anInteger) {
        data[data.length] = anInteger;
    }
}
Then, I write the Set of integers as a subclass of the List of integers:
class Set, inheriting from: List {
    add(Integer anInteger) {
        if (data.notContains(anInteger)) {
            super.add(anInteger);
        }
    }
}
Our Set of integers class is a subclass of List of integers, but it is not a subtype, because it does not satisfy all the features of the List class. The values and the signatures of the methods are satisfied, but the properties are not. The behaviour of the add(Integer) method has clearly been changed, not preserving the properties of the parent type. Think from the point of view of the client of your classes. They might receive a Set of integers where a List of integers is expected. The client might want to add a value and get that value added to the List even if that value already exists in the List. But she won't get that behaviour if the value exists. A big surprise for her!
This is a classic example of an improper use of inheritance. Use composition in this case.
(a fragment from: use inheritance properly).
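One way the composition version might look, sketched in Python. IntList, IntSet and add_twice are my own illustrative names, and the extra _seen set is just a convenience I added for the duplicate check.

```python
class IntList:
    def __init__(self):
        self._data = []

    def add(self, value):
        self._data.append(value)  # always grows by one

    def size(self):
        return len(self._data)


class IntSet:
    """Holds an IntList instead of inheriting from it, so it is not a List."""

    def __init__(self):
        self._items = IntList()
        self._seen = set()  # assumption: used only for the duplicate check

    def add(self, value):
        if value not in self._seen:
            self._seen.add(value)
            self._items.add(value)

    def size(self):
        return self._items.size()


def add_twice(int_list, value):
    """Client code that relies on the List property: every add grows the size."""
    int_list.add(value)
    int_list.add(value)
    return int_list.size()


list_size = add_twice(IntList(), 7)

s = IntSet()
s.add(7)
s.add(7)
set_size = s.size()
is_a_list = isinstance(s, IntList)
```

Since IntSet no longer claims to be an IntList, a client written against the List properties has no reason to expect them from it, and in a statically typed language the compiler would refuse to pass one where the other is required.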
Even though Composition is preferred, I would like to highlight pros of Inheritance and cons of Composition.
Pros of Inheritance:
It establishes a logical "IS A" relation. If Car and Truck are two types of Vehicle (the base class), then the child class IS A base class.
i.e.
Car is a Vehicle
Truck is a Vehicle
With inheritance, you can define/modify/extend a capability
Base class provides no implementation and sub-class has to override complete method (abstract) => You can implement a contract
Base class provides default implementation and sub-class can change the behaviour => You can re-define contract
Sub-class adds extension to base class implementation by calling super.methodName() as first statement => You can extend a contract
Base class defines structure of the algorithm and sub-class will override a part of algorithm => You can implement Template_method without change in base class skeleton
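The first three points can be sketched roughly like this. Car, Truck and Vehicle come from the example above; Motorcycle, wheels() and describe() are hypothetical additions purely for illustration.

```python
from abc import ABC, abstractmethod


class Vehicle(ABC):
    @abstractmethod
    def wheels(self):
        """No implementation here: every subclass must implement this."""

    def describe(self):
        # Default implementation that subclasses may change.
        return f"vehicle with {self.wheels()} wheels"


class Car(Vehicle):
    def wheels(self):  # implements the contract
        return 4


class Truck(Vehicle):
    def wheels(self):
        return 6

    def describe(self):  # extends the default by calling super() first
        return super().describe() + ", carrying cargo"


class Motorcycle(Vehicle):
    def wheels(self):
        return 2

    def describe(self):  # redefines the default entirely
        return "a motorcycle"


descriptions = [v.describe() for v in (Car(), Truck(), Motorcycle())]
```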
Cons of Composition:
In inheritance, subclass can directly invoke base class method even though it's not implementing base class method because of IS A relation. If you use composition, you have to add methods in container class to expose contained class API
e.g. If Car contains Vehicle and if you have to get price of the Car, which has been defined in Vehicle, your code will be like this
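class Vehicle {
    protected double price; // assumed field, for illustration only
    protected double getPrice() {
        return price;
    }
}
class Car {
    Vehicle vehicle;
    protected double getPrice() {
        // Car must forward the call to expose Vehicle's API
        return vehicle.getPrice();
    }
}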
A rule of thumb I have heard is that inheritance should be used when it's an "is-a" relationship and composition when it's a "has-a". Even with that, I feel that you should always lean towards composition because it eliminates a lot of complexity.
As many people have said, I will first start with the check of whether there exists an "is-a" relationship. If it exists, I usually check the following:
Whether the base class can be instantiated. That is, whether the base class can be non-abstract. If it can be non-abstract, I usually prefer composition.
E.g. 1. Accountant is an Employee. But I will not use inheritance because an Employee object can be instantiated.
E.g. 2. Book is a SellingItem. A SellingItem cannot be instantiated; it is an abstract concept. Hence I will use inheritance. The SellingItem is an abstract base class (or interface in C#).
What do you think about this approach?
Also, I support @anon's answer in Why use inheritance at all?
The main reason for using inheritance is not as a form of composition - it is so you can get polymorphic behaviour. If you don't need polymorphism, you probably should not be using inheritance.
@MatthieuM. says in https://softwareengineering.stackexchange.com/questions/12439/code-smell-inheritance-abuse/12448#comment303759_12448
The issue with inheritance is that it can be used for two orthogonal purposes:
interface (for polymorphism)
implementation (for code reuse)
Composition vs. inheritance is a wide subject. There is no real answer for which is better, as I think it all depends on the design of the system.
Generally, the type of relationship between objects provides better information for choosing one of them.
If the relation type is an "IS-A" relation, then inheritance is the better approach.
Otherwise, if the relation type is a "HAS-A" relation, then composition is the better approach.
It totally depends on the entity relationship.
Related
I often seem to run into the discussion of whether or not to apply some sort of prefix/suffix convention to interface type names, typically adding "I" to the beginning of the name.
Personally I'm in the camp that advocates no prefix, but that's not what this question is about. Rather, it's about one of the arguments I often hear in that discussion:
You can no longer see at-a-glance whether something is an interface or a class.
The question that immediately pops up in my head is: apart from object creation, why should you ever have to care whether an object reference is a class or an interface?
I've tagged this question as language agnostic, but as has been pointed out it may not be. I contend that it is because while specific language implementation details may be interesting, I'd like to keep this on a conceptual level. In other words, I think that, conceptually, you'd never have to care whether an object reference is typed as a class or an interface but I'm not sure, hence the question.
This is not a discussion about IDEs and what they do or don't do when visualizing the different types; caring about the type of an object is certainly a necessity when browsing through code (packages/sources/whatever form). Nor is it a discussion about the pros or cons of either naming convention. I just can't seem to figure out in what scenario, other than object creation, you actually care about whether or not you're referencing a concrete type or an interface.
Most of the time, you probably don't care. But here are some instances that I can think of where you would. There are several, and it does vary a little bit by language. Some languages don't mind as much as others.
In the case of inversion of control (where someone PASSES you a parameter) you probably don't care if it's an interface or an object as far as calling its methods etc. But when dealing with types, it definitely can make a difference.
In managed languages such as the .NET languages, an interface can only extend other interfaces, whereas a class can inherit from one class but implement many interfaces. The order of classes vs interfaces may also matter in a class or interface declaration. So you need to know which is which when defining a new class or interface.
In Delphi / VCL, interfaces are reference counted and automatically collected, whereas classes must be explicitly freed, so lifecycle management on the whole is affected, not just the creation.
Interfaces may not be viable sources for class references.
Interfaces can be cast to compatible interfaces, but in many languages, they cannot be cast to compatible classes. Classes can be cast to either.
Interfaces may be passed to parameters of type IID, or IUnknown, whereas classes cannot (without a cast and a supporting interface).
An interface's implementation is unknown. Its input and output are defined, but the implementation which creates the output is abstracted. In general, one's attitude may be that when working with a class, one may know how the class works. But when working with an interface, no such assumption should be made. In a perfect world, it might make no difference. But in reality, this most certainly can affect your design.
I agree with you (and thereby do not use an "I" prefix for interfaces). We shouldn't have to care whether it is an abstract class or an interface.
Worth noting that Java needs to have a notion of interface solely because it does not support multiple inheritance. Otherwise, "abstract class" concept would suffice (which may be "all" abstract, or partially abstract, or almost concrete and just 1 tiny bit abstract, whatever).
Things that concrete class can have and the interfaces can't:
Constructors
Instance fields
Static methods and static fields
So if you use the convention of starting all interface names with 'I' then it indicates to the user of your library that the particular type will not have any of the above mentioned things.
But personally I feel that this is not reason enough to start all interface names with 'I'. Modern IDEs are powerful enough to indicate if some type is an interface. Also it hides the true meaning of an interface name: imagine if the Runnable and List interfaces were named IRunnable and IList respectively.
When a class is used, I can make the assumption that I will get objects from a relatively small and almost well-defined range of subclasses. That's because subclassing is (or at least should be) a decision that isn't made too easily, especially in languages that don't support multiple inheritance. In contrast, interfaces can be implemented by any class, and the implementation can be added later to any class.
So the information is useful, especially when browsing through code, and trying to get a feeling what the code author intended to do - but I think it should be enough, if the IDE shows interfaces/classes as distinctive icons.
You want to see at a glance which are the "interfaces" and which are the "concrete classes" so that you can focus your attention to the abstractions in the design instead of the details.
Good designs are based on abstractions - if you know and understand them you understand the system without knowing any of the details. So you know you can skip the classes without the I prefix, and focus on the ones that do have it while you are understanding the code, and you also know to avoid building new code around non-interface classes without having to refer to some other design document.
I agree that the I* naming convention is just not appropriate for modern OO languages, but truth is this question isn't really language agnostic. There are legitimate cases where you have an interface not for any architectural reason but because you simply don't have an implementation or have access to an implementation. For these cases you can read I* as *Stub or similar, and, in these cases, it might make sense to have an IBlah and a Blah class.
These days, though, you rarely come up against this, and in modern OO languages when you say Interface you actually mean Interface not just I don't have the code for this. So there is no need for the I*, and in fact it encourages really bad OO design as you won't get the natural naming conflicts that would tell you something's gone wrong in your architecture. Say you had a List and an IList... what's the difference? when would you use one over the other? if you wanted to implement IList would you be constrained (conceptually at least) by what List does? I'll tell you what... if I found both an IBlah and a Blah class in any of my codebases I would purge one at random and take away that person's commit privileges.
Interfaces don't have fields, hence when you use IDisposable (or whatever), you know you're only declaring what you can do. That seems to me the main point of it.
Distinguishing between interfaces and classes may be useful, anywhere the type is referenced, in the IDE or out, to determine:
Can I make a new implementation of this type?
Can I implement this interface in a language that does not support multiple inheritance of implementation classes (e.g., Java)?
Can there be multiple implementations of this type?
Can I easily mock this interface in an arbitrary mocking framework?
It is worth noting that UML distinguishes between interfaces and implementation classes. In addition, the "I" prefix is used in the examples in "The Unified Modeling Language User Guide" by the three amigos Booch, Jacobson and Rumbaugh. (Incidentally, this also provides an example why IDE syntax coloring alone is not sufficient to distinguish in all contexts.)
You should care, because :
An interface with capital "I" enables one, namely you or your co-workers, to use any implementation which implements the interface. Without it, if in the future you figure out a better way to do something, say a better list sorting algorithm, you will be stuck with having to change ALL of the invoking methods as well.
It helps in understanding code: e.g. you don't need to memorize all 10 implementations of, say, ISortableList; you just care that it sorts a list (or something like that). Your code becomes practically self-documenting here.
To complete the discussion, here is a pseudocode example illustrating the above:
//Pseudocode - define implementations of ISortableList
Class SortList1 : ISortableList, SortList2 : ISortableList, SortList3 : ISortableList
//Pseudocode - the interface way
void Populate(ISortableList list, int[] nums)
{
    list.set(nums)
}
//Pseudocode - the "I don't care way"
void Populate2(SortList1 list, int[] nums)
{
    list.set(nums)
}
...
//Pseudocode - create instances
SortList1 list1 = new SortList1();
SortList2 list2 = new SortList2();
SortList3 list3 = new SortList3();
//Invoke Populate() - the "interface way"
Populate(list1, nums); //OK, list1 is an ISortableList implementation
Populate(list2, nums); //OK, list2 is an ISortableList implementation
Populate(list3, nums); //OK, list3 is an ISortableList implementation
//Invoke Populate2() - the "I don't care way"
Populate2(list1, nums); //OK, list1 is an instance of SortList1
Populate2(list2, nums); //Not OK, list2 is not of the required argument type, won't compile
Populate2(list3, nums); //the same as above
Hope this helps,
Jas.
Here is the problem statement: Calling a setter on the object should result in the object to change to an object of a different class, which language can support this?
Ex. I have a class called "Man" (Parent Class), and two children namely "Toddler" and "Old Man", they are its children because they override a behaviour in Man called as walk. ( i.e Toddler sometimes walks using both his hands and legs kneeled down and the Old man uses a stick to support himself).
The Man class has a attribute called age, I have a setter on Man, say setAge(int ageValue). I have 3 objects, 2 toddlers, 1 old-Man. (The system is up and running, I guess when we say objects it is obvious). I will make this call, toddler.setAge(80), I expect the toddler to change to an object of type Old Man. Is this possible? Please suggest.
Thanks,
This sounds to me like the model is wrong. What you have is a Person whose relative temporal grouping and some specific behavior changes with age.
Perhaps you need a method named getAgeGroup() which returns an appropriate Enum, depending on what the current age is. You also need an internal state object which encapsulates the state-specific behavior to which your Person delegates behavior which changes with age.
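Something along these lines, as a rough sketch. The age thresholds and the behaviour class names are arbitrary choices of mine, not part of the question.

```python
from enum import Enum


class AgeGroup(Enum):
    TODDLER = "toddler"
    ADULT = "adult"
    OLD = "old"


class CrawlOrWalk:
    def walk(self):
        return "walks on hands and knees"


class PlainWalk:
    def walk(self):
        return "walks unaided"


class WalkWithStick:
    def walk(self):
        return "walks with a stick"


# One state object per age group (assumed mapping, for illustration only).
_BEHAVIOUR = {
    AgeGroup.TODDLER: CrawlOrWalk(),
    AgeGroup.ADULT: PlainWalk(),
    AgeGroup.OLD: WalkWithStick(),
}


class Person:
    def __init__(self, age):
        self.set_age(age)

    def set_age(self, age):
        self.age = age
        # Swap the internal state object; the Person itself stays the same object.
        self._state = _BEHAVIOUR[self.get_age_group()]

    def get_age_group(self):
        # Thresholds are arbitrary illustration values.
        if self.age < 4:
            return AgeGroup.TODDLER
        if self.age < 70:
            return AgeGroup.ADULT
        return AgeGroup.OLD

    def walk(self):
        return self._state.walk()  # delegate age-dependent behaviour


person = Person(2)
before = person.walk()
person.set_age(80)
after = person.walk()
```

The caller keeps the same Person reference throughout; only the delegate behind walk() changes when the age crosses a threshold.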
That said, changing the type of an instantiated object dynamically will likely be doable only with dynamically typed languages; certainly it's not doable in Java, and probably not doable in C# and most other statically typed languages.
This is a common problem that you can solve using combination of OO modelling and design patterns.
You will model the classes the way you have, where Toddler and OldMan inherit from the Man base class. You will need to introduce a Proxy (see the GoF design pattern) class as your access to your Man class. Internally, the proxy holds a Man object/pointer/reference to either a Toddler or an OldMan. The proxy will expose all the interfaces that are exposed by the Man class so that you can use it as it is, and in your scenario you will implement setAge similar to the pseudo code below:
public void setAge(int age)
{
    if (age > TODDLER_MAX && myMan is Toddler)
        myMan = new OldMan();
    else
        .....
    myMan.setAge(age);
}
If your language does not support changing the classtype at runtime, take a look at the decorator and strategy patterns.
Objects in Python can change their class by setting the __class__ attribute. Otherwise, use the Strategy pattern.
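For the curious, the __class__ route looks roughly like this. It is a sketch using the class names from the question and made-up age thresholds; it works for ordinary classes with compatible layouts, and whether it is a good idea is a separate matter.

```python
class Man:
    def __init__(self, age):
        self.age = age

    def walk(self):
        return "walks unaided"

    def set_age(self, age):
        self.age = age
        # Reassigning __class__ changes which class's methods this instance uses.
        # The thresholds are arbitrary illustration values.
        if age < 4:
            self.__class__ = Toddler
        elif age >= 70:
            self.__class__ = OldMan
        else:
            self.__class__ = Man


class Toddler(Man):
    def walk(self):
        return "walks on hands and knees"


class OldMan(Man):
    def walk(self):
        return "walks with a stick"


person = Toddler(2)
person.set_age(80)
new_type = type(person).__name__
new_walk = person.walk()
```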
I wonder if subclassing is really the best solution here. A property (enum, probably) that has different types of people as its possible values is one alternative. Or, for that matter, a derived property or method that tells you the type of person based on the age.
JavaScript can do this. At any time you can take an existing object and add new methods to it, or change its existing methods. This can be done at the individual object level.
Douglas Crockford writes about this in Classical Inheritance in JavaScript:
Class Augmentation
JavaScript's dynamism allows us to add or replace methods of an existing class. We can call the method method at any time, and all present and future instances of the class will have that method. We can literally extend a class at any time. Inheritance works retroactively. We call this Class Augmentation to avoid confusion with Java's extends, which means something else.
Object Augmentation
In the static object-oriented languages, if you want an object which is slightly different than another object, you need to define a new class. In JavaScript, you can add methods to individual objects without the need for additional classes. This has enormous power because you can write far fewer classes and the classes you do write can be much simpler. Recall that JavaScript objects are like hashtables. You can add new values at any time. If the value is a function, then it becomes a method.
Common Lisp can: use the generic function CHANGE-CLASS.
I am surprised no one so far seemed to notice that this is the exact case for the State design pattern (although @Fadrian in fact described the core idea of the pattern quite precisely, without mentioning its name).
The state pattern is a behavioral software design pattern, also known as the objects for states pattern. This pattern is used in computer programming to represent the state of an object. This is a clean way for an object to partially change its type at runtime.
The referenced page gives examples in Java and Python. Obviously it can be implemented in other strongly typed languages as well. (OTOH weakly typed languages have no need for State, as these support such behaviour out of the box.)
Although I'm coding in ObjC, this question is intentionally language-agnostic; it should apply to most OO languages.
Let's say I have a "Collection" class, and I want to create a "FilteredCollection" that inherits from "Collection". Filters will be set up at object-creation time, and from then on, the class will behave like a "Collection" with the filters applied to its contents.
I do things the obvious way and subclass Collection. I override all the accessors, and think I've done a pretty neat job - my FilteredCollection looks like it should behave just like a Collection, but with objects that are 'in' it that correspond to my filters being filtered out to users. I think I can happily create FilteredCollections and pass them around my program as Collections.
But I come to testing and - oh no - it's not working. Delving into the debugger, I find that it's because the Collection implementation of some methods is calling the overridden FilteredCollection methods (say, for example, there's a "count" method that Collection relies upon when iterating its objects, but now it's getting the filtered count, because I overrode the count method to give the correct external behaviour).
What's wrong here? Why does it feel like some important principles are being violated despite the fact that it also feels like OO 'should' work this way? What's a general solution to this issue? Is there one?
I know, by the way, that a good 'solution' to this problem in particular would be to filter the objects before I put them into the collection, and not have to change Collection at all, but I'm asking a more general question than that - this is just an example. The more general issue is methods in an opaque superclass that rely on the behaviour of other methods that could be changed by subclasses, and what to do in the case that you want to subclass an object to change behaviour like this.
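A stripped-down sketch of the situation, in Python rather than ObjC. It is deliberately simplified: this FilteredCollection overrides only count(), and the names get, to_list and keep are made up for the illustration.

```python
class Collection:
    def __init__(self, items):
        self._items = list(items)

    def count(self):
        return len(self._items)

    def get(self, index):
        return self._items[index]

    def to_list(self):
        # The base class calls its own overridable count() and get().
        return [self.get(i) for i in range(self.count())]


class FilteredCollection(Collection):
    """Naive subclass: overrides count() only, to report the filtered size."""

    def __init__(self, items, keep):
        super().__init__(items)
        self._keep = keep

    def count(self):
        return sum(1 for item in self._items if self._keep(item))


evens = FilteredCollection([1, 2, 3, 4], keep=lambda n: n % 2 == 0)
visible = evens.to_list()
```

Here to_list() walks the unfiltered storage but stops at the filtered count, so it yields the first two stored items instead of the two even ones.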
The Collection that you inherit from has a certain contract. Users of the class (and that includes the class itself, because it can call its own methods) assume that subclasses obey the contract. If you're lucky, the contract is specified clearly and unambiguously in its documentation...
For example, the contract could say: "if I add an element x, then iterate over the collection, I should get x back". It seems that your FilteredCollection implementation breaks that contract.
There is another problem here: Collection should be an interface, not a concrete implementation. An implementation (e.g. TreeSet) should implement that interface, and of course also obey its contract.
In this case, I think the correct design would be not to inherit from Collection, but rather create FilteredCollection as a "wrapper" around it. Probably FilteredCollection should not implement the Collection interface, because it does not obey the usual contract for collections.
Rather than subclassing Collection to implement FilteredCollection, try implementing FilteredCollection as a separate class that implements ICollection and delegates to an existing collection. This is similar to the Decorator pattern from the Gang of Four.
Partial example:
class FilteredCollection implements ICollection
{
    private ICollection baseCollection;
    public FilteredCollection(ICollection baseCollection)
    {
        this.baseCollection = baseCollection;
    }
    public GetItems()
    {
        return Filter(baseCollection.GetItems());
    }
    private Filter(...)
    {
        //do filter here
    }
}
Implementing FilteredCollection as a decorator for ICollection has the added benefit that you can filter anything that implements ICollection, not just the one class you subclassed.
For added goodness, you can use the Command pattern to inject a specific implementation of Filter() into the FilteredCollection at runtime, eliminating the need to write a different FilteredCollection implementation for every filter you want to apply.
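As a rough Python sketch of injecting the filter: here a plain callable stands in for the injected filter object, and the method names are my own rather than any real ICollection API.

```python
class FilteredCollection:
    """Wraps any iterable collection and applies an injected predicate."""

    def __init__(self, base_collection, predicate):
        self._base = base_collection
        self._predicate = predicate  # the filter is supplied by the caller

    def get_items(self):
        return [item for item in self._base if self._predicate(item)]

    def count(self):
        return len(self.get_items())


numbers = [1, 2, 3, 4, 5, 6]
evens = FilteredCollection(numbers, lambda n: n % 2 == 0)
large = FilteredCollection(numbers, lambda n: n > 4)
```

One wrapper class then serves every filter, and the wrapped collection is never modified.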
(Note: whilst I'll use your example, I'll try to concentrate on the concept rather than tell you what's wrong with your specific example.)
Black Box Inheritance?
What you're crashing into is the myth of "Black box inheritance". It's often not actually possible to completely separate implementations that allow inheritance from implementations that use that inheritance. I know this flies in the face of how inheritance is often taught, but there it is.
To take your example, it's quite reasonable for you to want the consumers of the collection contract to see a Count which matches the number of items they can get out of your collection. It's also quite reasonable for code in the inherited base class to access its Count property and get what it expects. Something has to give.
Who is Responsible?
Answer: The base class. To achieve both of the goals above, the base class needs to handle things differently. Why is this the responsibility of the base class? Because it allows itself to be inherited from and allows the member implementation to be overridden. Now it may be that in some languages that facilitate an OO design you aren't given a choice. However, that just makes this problem harder to deal with; it still needs to be dealt with.
In the example, the base collection class should have its own internal means of determining its actual count, in the knowledge that a sub-class may override the existing implementation of Count. Its own implementation of the public and overridable Count property should not impact the internal operation of the base class but just be a means to achieve the external contract it is implementing.
Of course this means the implementation of the base class isn't as crisp and clean as we would like. That's what I mean by the black box inheritance being a myth, there is some implementation cost just to allow inheritance.
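A sketch of what that defensive base class might look like. _raw_count, all_items and items are names I invented, and Python only marks the internal method as private by convention.

```python
class Collection:
    def __init__(self, items):
        self._items = list(items)

    def _raw_count(self):
        # Internal bookkeeping; by convention subclasses leave this alone.
        return len(self._items)

    def count(self):
        # Public, overridable view of the size.
        return self._raw_count()

    def all_items(self):
        # Internal iteration uses _raw_count(), never the overridable count().
        return [self._items[i] for i in range(self._raw_count())]


class FilteredCollection(Collection):
    def __init__(self, items, keep):
        super().__init__(items)
        self._keep = keep

    def count(self):
        return len(self.items())

    def items(self):
        return [item for item in self.all_items() if self._keep(item)]


evens = FilteredCollection([1, 2, 3, 4], keep=lambda n: n % 2 == 0)
```

Overriding count() no longer disturbs the base class's own iteration, at the cost of the base class carrying two notions of size.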
The Bottom Line...
is an inheritable class needs to be coded defensively so that it doesn't rely on assumed operation of overridable members. OR it needs to be very clear in some form of documentation exactly what behaviour is expected from overridden implementations of members (this is common in classes that define abstract members).
Your FilteredCollection feels wrong. Usually, when you have a collection and you add a new element into it, you expect that its count increases by one, and the new element is added to the container.
Your FilteredCollection does not work like this - if you add an item that is filtered, the count of the container might not change. I think this is where your design goes wrong.
If that behaviour is intended, then the contract for count makes it unsuitable for the purpose your member functions are trying to use it for.
I think that the real issue is a misunderstanding of how object-oriented languages are supposed to work. I'm guessing that you have code that looks something like this:
Collection myCollection = myFilteredCollection;
And expect to invoke the methods implemented by the Collection class. Correct?
In a C++ program, this might work, provided that the methods on Collection are not defined as virtual methods. However, this is an artifact of the design goals of C++.
In just about every other object-oriented language, all methods are dispatched dynamically: they use the type of the actual object, not the type of the variable.
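A small Java sketch of dynamic dispatch, using hypothetical class names (Java methods are virtual by default):

```java
// Hypothetical classes, named to avoid clashing with java.util.Collection.
class SimpleCollection {
    public String describe() {
        return "Collection";
    }
}

class FilteredSimpleCollection extends SimpleCollection {
    @Override
    public String describe() {
        return "FilteredCollection";
    }
}
```

Assigning `new FilteredSimpleCollection()` to a variable of type `SimpleCollection` and calling `describe()` runs the subclass implementation, because the call is resolved against the actual object rather than the declared type of the variable.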
If that's not what you're wondering, then read up on the Liskov Substitution Principle, and ask yourself whether you're breaking it. Lots of class hierarchies do.
What you described is a quirk of polymorphism. Since you can address an instance of a subclass as an instance of the parent class, you may not know what kind of implementation lies underneath the covers.
I think your solution is pretty simple:
You stated that you don't modify the collection, you only apply a filter to it when people fetch from it. Therefore you should not override the count method. All of those elements are in the collection therefore don't lie to the caller.
You want the base .count method to behave normally, but you still want the filtered count, so you should implement a getFilteredCount method which returns the number of elements post filtering.
Subclassing is all about the 'Kind of' relationship. What you're doing is not out of the norm but not the most standard use case either. You're applying a filter to a collection, so you can claim that a 'FilteredCollection' is a 'kind of' collection, but in reality you're not actually modifying the collection; you're just wrapping it with a layer that simplifies filtering. In any case, this should work. The only downside is that you have to remember to call 'getFilteredCount' instead of '.getCount'.
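A rough Java sketch of this suggestion, assuming the filter is supplied as a `Predicate` (the `FilteredList` name and its constructor are illustrative assumptions, not the asker's code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical subclass: size() is deliberately NOT overridden.
class FilteredList<T> extends ArrayList<T> {
    private final Predicate<T> filter;

    FilteredList(Predicate<T> filter) {
        this.filter = filter;
    }

    // Number of elements that pass the filter.
    public int getFilteredCount() {
        return (int) stream().filter(filter).count();
    }

    // The elements that pass the filter, in insertion order.
    public List<T> getFiltered() {
        return stream().filter(filter).collect(Collectors.toList());
    }
}
```

The inherited `size()` keeps reporting every stored element, so any base-class logic that depends on it is unaffected; callers who want the filtered view ask for it explicitly.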
The example falls into "Doctor, it hurts when I do this" category. Yes, subclasses can break superclasses in various ways. No, there is no simple waterproof solution to prevent that.
You can seal your superclass (make everything final) if your language supports this but then you lose flexibility. This is the bad kind of defensive programming (the good relies on robust code, the bad relies on strong restrictions).
The best you can do is to act at human level - make sure that the human that writes the subclass understands the superclass. Tutoring/code review, good documentation, unit tests (in roughly this order of importance) can help achieve this. And of course it doesn't hurt to code the base class defensively.
You could argue that the superclass is not well-designed for subclassing, at least not in the way you want to. When the superclass calls "Count()" or "Next()" or whatever, it doesn't have to let that call be overridden. In C++, it can't be overridden unless it's declared "virtual", but that doesn't apply in all languages - for example, Obj-C is inherently virtual if I remember correctly.
It's even worse - this problem can happen to you even if you don't override methods in the superclass - see Subtyping vs Subclassing. See in particular the OOP problems reference in that article.
It behaves this way because this is how object-oriented programming is supposed to work!
The whole point of OOP is supposed to be that a sub-class can redefine some of its superclass's methods, and then operations done at the superclass level will get the subclass implementation.
Let's make your example a little more concrete. We create a "Collection animal" that contains dog, cat, lion, and basilisk. Then we create a FilteredCollection domesticAnimal that filters out the lion and basilisk. So now if we iterate over domesticAnimal we expect to see only dog and cat. If we ask for a count of the number of members, would we not expect the result to be "2"? It would surely be curious behavior if we asked the object how many members it had and it said "4", and then when we asked it to list them it only listed 2.
Making the overrides work at the superclass level is an important feature of OOP. It allows us to define a function that takes, in your example, a Collection object as a parameter and operates on it, without knowing or caring whether underneath it is really a "pure" Collection or a FilteredCollection. Everything should work either way. If it's a pure Collection it gets the pure Collection functions; if it's a FilteredCollection it gets the FilteredCollection functions.
If the count is also used internally for other purposes -- like deciding where new elements should go, so that you add what is really a fifth element and it mysteriously overwrites #3 -- then you have a problem in the design of the classes. OOP gives you great power over how classes operate, but with great power comes great responsibility. :-) If a function is used for two different purposes, and you override the implementation to satisfy your requirements for purpose #1, it's up to you to make sure that that doesn't break purpose #2.
My first reaction to your post was the mention of overriding "all the accessors." This is something I've seen a lot of: extending a base class then overriding most of the base class methods. This defeats the purpose of inheritance in my opinion. If you need to override most base class functions then it's time to reconsider why you're extending the class. As said before, an interface may be a better solution, since it loosely couples disparate objects. The sub-class should EXTEND the functionality of the base class, not completely rewrite it.
I can't help but think that if you are overriding most of the base class members, it is quite logical that unexpected behavior would occur.
When I first grok'd how inheritance worked I used it a lot. I had these big trees with everything connected one way or another.
What a pain.
For what you want, you should be referencing your object, not extending it.
Also, I'd personally hide any trace of passing a collection from my public API (and, in general, my private API as well). Collections are impossible to make safe. Wrapping a collection (Come on, what's it used for??? You can guess just from the signature, right?) inside a WordCount class or a UsersWithAges class or an AnimalsAndFootCount class can make a lot more sense.
Also, having methods like wordCount.getMostUsedWord(), usersWithAges.getUsersOverEighteen() and animalsAndFootCount.getBipeds() moves repetitive utility functionality scattered throughout your code into your new-fangled business collection where it belongs.
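A minimal sketch of such a wrapper in Java. Only the names WordCount and getMostUsedWord come from the text above; the `add` method and the internal `HashMap` are assumptions made for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// The map is an implementation detail and is never handed to callers.
class WordCount {
    private final Map<String, Integer> counts = new HashMap<>();

    // Record one occurrence of a word (assumed helper, not from the original).
    public void add(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    // Business-level query that would otherwise be repeated at each call site.
    public Optional<String> getMostUsedWord() {
        return counts.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}
```

Because callers never see the map, it could later be swapped for a different structure without touching any code outside the class.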
Allen Holub wrote the following,
You can't have a program without some coupling. Nonetheless, you can minimize coupling considerably by slavishly following OO (object-oriented) precepts (the most important is that the implementation of an object should be completely hidden from the objects that use it). For example, an object's instance variables (member fields that aren't constants), should always be private. Period. No exceptions. Ever. I mean it. (You can occasionally use protected methods effectively, but protected instance variables are an abomination.)
Which sounds reasonable, but he then goes on to say,
You should never use get/set functions for the same reason—they're just overly complicated ways to make a field public (though access functions that return full-blown objects rather than a basic-type value are reasonable in situations where the returned object's class is a key abstraction in the design).
Which, frankly, just sounds insane to me.
I understand the principle of information hiding, but without accessors and mutators you couldn't use Java beans at all. I don't know how you would follow a MVC design without accessors in the model, since the model can not be responsible for rendering the view.
However, I am a younger programmer and I learn more about Object Oriented Design every day. Perhaps someone with more experience can weigh in on this issue.
Allen Holub's articles for reference
Why Extends Is Evil
Why Getter And Setter Methods Are Evil
Related Questions:
Java: Are Getters and Setters evil?
Is it really that wrong not using setters and getters?
Are get and set functions popular with C++ programmers?
Should you use accessor properties from within the class, or just from outside of the class?
I don't have a problem with Holub telling you that you should generally avoid altering the state of an object but instead resort to integrated methods (execution of behaviors) to achieve this end. As Corletk points out, there is wisdom in thinking long and hard about the highest level of abstraction and not just programming thoughtlessly with getters/setters that just let you do an end-run around encapsulation.
However, I have a great deal of trouble with anyone who tells you that you should "never" use setters or should "never" access primitive types. Indeed, the effort required to maintain this level of purity in all cases can and will end up causing more complexity in your code than using appropriately implemented properties. You just have to have enough sense to know when you are skirting the rules for short-term gain at the expense of long-term pain.
Holub doesn't trust you to know the difference. I think that knowing the difference is what makes you a professional.
Read through that article carefully. Holub is pushing the point that getters and setters are an evil "default antipattern", a bad habit that we slip into when designing a system; because we can.
The thought process should be along the lines; What does this object do? What are its responsibilities? What are its behaviours? What does it know? Thinking long and hard on these questions leads you naturally towards designing classes which expose the highest-level interface possible.
A car is a good example. It exposes a well-defined, standardised high-level interface. I don't concern myself with setSpeed(60)... is that MPH or km/h? I just accelerate, cruise, decelerate. I don't have to think about the details in setSteeringWheelAngle(getSteeringWheelAngle()+Math.toRadians(-1.5)), I just turn(-1.5), and the details are taken care of under the hood.
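To make the car analogy concrete, here is a hedged Java sketch. The units, the steering limit and every member name are assumptions chosen for illustration:

```java
// Hypothetical Car exposing behaviour rather than raw fields.
class Car {
    private static final double MAX_STEERING_DEG = 45.0; // assumed limit

    private double speedKmh = 0.0;    // unit is an internal detail
    private double steeringDeg = 0.0;

    public void accelerate(double deltaKmh) {
        if (deltaKmh < 0) {
            throw new IllegalArgumentException("use decelerate() to slow down");
        }
        speedKmh += deltaKmh;
    }

    public void decelerate(double deltaKmh) {
        if (deltaKmh < 0) {
            throw new IllegalArgumentException("delta must be non-negative");
        }
        // Speed never drops below zero, whatever the caller asks for.
        speedKmh = Math.max(0.0, speedKmh - deltaKmh);
    }

    // Relative turn; clamping to the limit happens "under the hood".
    public void turn(double degrees) {
        double target = steeringDeg + degrees;
        steeringDeg = Math.max(-MAX_STEERING_DEG, Math.min(MAX_STEERING_DEG, target));
    }

    public boolean isMoving() {
        return speedKmh > 0.0;
    }
}
```

There is no setSpeed or setSteeringWheelAngle here: callers say what they want the car to do, and the class decides how its state changes.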
It boils down to "You can and should figure out what every class will be used for, what it does, what it represents, and expose the highest level interface possible which fulfills those requirements. Getters and setters are usually a cop-out, when the programmer is just too lazy to do the analysis required to determine exactly what each class is and is-not, and so we go down the path of 'it can do anything'. Getters and setters are evil!"
Sometimes the actual requirements for a class are unknowable ahead of time. That's cool, just cop out and use the getter/setter antipattern for now, but when you do know, through experience, what the class is being used for, you'll probably want to come back and clean up the dirty low-level interface. Refactoring based on "stuff you wish you knew when you wrote the sucker in the first place" is par for the course. You don't have to know everything in order to make a start, it's just that the more you do know, the less rework is likely to be required along the way.
That's the mentality he's promoting. Getters and setters are an easy trap to fall into.
Yes, beans basically require getters and setters, but to me a bean is a special case. Beans represent nouns, things, tangible identifiable (if not physical) objects. Not a lot of objects actually have automatic behaviours; most times things are manipulated by external forces, including humans, to make them productive things.
daisy.setColor(Color.PINK) makes perfect sense. What else can you do? Maybe a Vulcan mind-meld, to make the flower want to be pink? Hmmm?
Getters and setters have their "evil" place. It's just, like all really good OO things, we tend to overuse them, because they are safe and familiar, not to mention simple, and therefore it might be better if noobs didn't see or hear about them, at least until they'd mastered the mind-meld thing.
I think what Allen Holub tried to say, rephrased in this article, is the following.
Getters and setters can be useful for variables that you specifically want to encapsulate, but you don't have to use them for all variables. In fact, using them for all variables is nasty code smell.
The trouble programmers have, and Allen Holub was right in pointing it out, is that they sometimes use getters/setters for all variables. And the purpose of encapsulation is lost.
(note I'm coming at this from a .NET "property" angle)
Well, simply - I don't agree with him; he makes a big fuss about the return type of properties being a bad thing because it can break your calling code - but exactly the same argument would apply to method arguments. And if you can't use methods either?
OK, method arguments could be changed as widening conversions, but.... just why... Also, note that in C# the var keyword could mitigate a lot of this perceived pain.
Accessors are not an implementation detail; they are the public API / contract. Yup, if you break the contract you have trouble. When did that become a surprise? Likewise, it is not uncommon for accessors to be non-trivial - i.e. they do more than just wrap fields; they perform calculations, logic checks, notifications, etc. And they allow interface based abstractions of state. Oh, and polymorphism - etc.
Re the verbose nature of accessors (p3?4?) - in C#: public int Foo {get; private set;} - job done.
Ultimately, all of code is a means to express our intent to the compiler. Properties let me do that in a type-safe, contract-based, verifiable, extensible, polymorphic way - thanks. Why do I need to "fix" this?
Getters and setters are used as little more than a mask to make a private variable public.
There's no point repeating what Holub said already but the crux of it is that classes should represent behaviour and not just state.
Some opposing views are in italics:
Though getIdentity starts with "get," it's not an accessor because it doesn't just return a field. It returns a complex object that has reasonable behavior
Oh but wait... then it's okay to use accessors as long as you return objects instead of primitive types? Now that's a different story, but it's just as dumb to me. Sometimes you need an object, sometimes you need a primitive type.
Also, I notice that Allen has radically softened his position since his previous column on the same topic, where the mantra "Never use accessors" didn't suffer one single exception. Maybe he realized after a few years that accessors do serve a purpose after all...
Bear in mind that I haven't actually put any UI code into the business logic. I've written the UI layer in terms of AWT (Abstract Window Toolkit) or Swing, which are both abstraction layers.
Good one. What if you are writing your application on SWT? How "abstract" is really AWT in that case? Just face it: this advice simply leads you to write UI code in your business logic. What a great principle. After all, it's only been like at least ten years since we've identified this practice as one of the worst design decisions you can make in a project.
My problem as a novice programmer is that I sometimes stumble onto articles on the internet and give them more credence than I should. Perhaps this is one of those cases.
When ideas like these are presented to me, I like to take a look at libraries and frameworks I use and which I like using.
For example, although some will disagree, I like the Java Standard API. I also like the Spring Framework. Looking at the classes in these libraries, you will notice that very rarely there are setters and getters which are there just to expose some internal variable. There are methods named getX, but that does not mean it is a getter in the conventional sense.
So, I think he has a point, and it is this: every time you choose "Generate getters/setters" in Eclipse (or your IDE of choice), you should take a step back and wonder what you are doing. Is it really appropriate to expose this internal representation, or did I mess up my design at some step?
I don't believe he's saying never use get/set, but rather that using get/set for a field is no better than just making the field public (e.g. public string Name vs. public string Name {get; set; }).
If get/set is used it limits the information hiding of OO which can potentially lock you into a bad interface.
In the above example, Name is a string; what if we want to change the design later to add multiple Names? The interface exposed only a single string so we can’t add more without breaking existing implementation.
However, if instead of using get/set you initially had a method such as Add(string name), internally you could process name singularly or add to a list or what not and externally call the Add method as many times as you want to add more Names.
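A Java rendering of that idea might look like the sketch below. The `Person` class, `addName` and `displayName` are hypothetical names; the original example is C#-flavoured:

```java
import java.util.ArrayList;
import java.util.List;

// Callers add names; how they are stored stays private.
class Person {
    // Started life as "one name"; a list now, with no change for callers.
    private final List<String> names = new ArrayList<>();

    public void addName(String name) {
        names.add(name);
    }

    // A behaviour-level query instead of exposing the list itself.
    public String displayName() {
        return String.join(" ", names);
    }
}
```

Code that only ever called `addName` once keeps working unchanged when the class starts supporting several names.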
The OO goal is to design with a level of abstraction; don’t expose more detail than you absolutely have to.
Chances are if you’ve just wrapped a primitive type with a get/set you’ve broken this tenet.
Of course, this is if you believe in the OO goals; I find that most don't, not really, they just use Objects as a convenient way to group functional code.
Public variables make sense when the class is nothing more than a bundle of data with no real coherency, or when it's really, really elementary (such as a point class). In general, if there's any variable in a class that you think probably shouldn't be public, that means that the class has some coherence, and variables have a certain relation that should be maintained, so all variables should be private.
Getters and setters make sense when they reflect some sort of coherent idea. In a polygon class, for example, the x and y coordinates of given vertices have a meaning outside the class boundary. It probably makes sense to have getters, and it likely makes sense to have setters. In a bank account class, the balance is probably stored as a private variable, and almost certainly should have a getter. If it has a setter, it needs to have logging built in to preserve auditability.
There are some advantages of getters and setters over public variables. They provide some separation of interface and implementation. Just because a point has a .getX() function doesn't mean there has to be an x, since .getX() and .setX() can be made to work just fine with radial coordinates. Another is that it's possible to maintain class invariants, by doing whatever's necessary to keep the class consistent within the setter. Another is that it's possible to have functionality that triggers on a set, like the logging for the bank account balance.
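The radial-coordinates point can be sketched in Java as follows. `PolarPoint` is a hypothetical class: it stores a radius and an angle, yet still answers getX() and setX():

```java
// Stores polar coordinates internally; exposes Cartesian accessors.
class PolarPoint {
    private double radius;
    private double angle; // radians

    PolarPoint(double x, double y) {
        setCartesian(x, y);
    }

    // Single place where the internal representation is updated.
    private void setCartesian(double x, double y) {
        radius = Math.hypot(x, y);
        angle = Math.atan2(y, x);
    }

    public double getX() {
        return radius * Math.cos(angle);
    }

    public double getY() {
        return radius * Math.sin(angle);
    }

    public void setX(double x) {
        setCartesian(x, getY());
    }

    public void setY(double y) {
        setCartesian(getX(), y);
    }
}
```

Callers cannot tell that no x field exists, which is the separation of interface and implementation described above (up to floating-point rounding).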
However, for more abstract classes, the member variables lose individual significance, and only make sense in context. You don't need to know all the internal variables of a C++ stream class, for example. You need to know how to get elements in and out, and how to perform various other actions. If you counted on the exact internal structure, you'd be bogged down in detail that could arbitrarily vary between compilers or versions.
So, I'd say to use private variables almost exclusively, getters and setters where they have a real meaning in object behavior, and not otherwise.
Just because getters and setters are frequently overused doesn't mean they're useless.
The trouble with getters/setters is they try to fake encapsulation but they actually break it by exposing their internals. Secondly they are trying to do two separate things - providing access to and controlling their state - and end up doing neither very well.
It breaks encapsulation because when you call a get/set method you first need to know the name (or have a good idea) of the field you want to change, and second you have to know its type, e.g. you couldn't call
setPositionX("some string");
If you know the name and type of the field, and the setter is public, then anyone can call the method as if it were a public field anyway, it's just a more complicated way of doing it, so why not just simplify it and make it a public field in the first place.
By allowing access to its state but trying to control it at the same time, a get/set method just confuses things and ends up either being useless boiler-plate, or misleading, by not actually doing what it says it does by having side-effects the user might not expect. If error checking is needed, it could be called something like
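// InvalidParameterException lives in java.security and is unchecked.
public void tryPositionX(int x) throws InvalidParameterException {
    // Only accept non-negative positions; reject everything else loudly.
    if (x >= 0)
        this.x = x;
    else
        throw new InvalidParameterException("Holy Negatives Batman!");
}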
or if additional code is needed it could be called a more accurate name based on what the whole method does eg.
public void tryPositionXAndLog(int x) throws InvalidParameterException {
    tryPositionX(x);
    // Count only the changes that passed validation.
    numChanges++;
}
IMHO needing getters/setters to make something work is often a symptom of bad design.
Make use of the "tell, don't ask" principle, or re-think why an object needs to send its state data in the first place. Expose methods that change an object's behaviour instead of its state. Benefits of that include easier maintenance and increased extensibility.
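One way to picture "tell, don't ask" is the Java sketch below. The `Account` class and its rules are assumptions for illustration, not something from Holub's article:

```java
// The caller tells the account to withdraw; the account enforces its own rules.
class Account {
    private long balanceCents;

    Account(long openingBalanceCents) {
        if (openingBalanceCents < 0) {
            throw new IllegalArgumentException("opening balance must be non-negative");
        }
        this.balanceCents = openingBalanceCents;
    }

    // Returns true if the withdrawal happened, false if it was refused.
    public boolean withdraw(long amountCents) {
        if (amountCents <= 0 || amountCents > balanceCents) {
            return false;
        }
        balanceCents -= amountCents;
        return true;
    }

    // Read-only view; there is deliberately no setter for the balance.
    public long balanceCents() {
        return balanceCents;
    }
}
```

The "ask" style would have the caller read the balance, do the arithmetic and write the result back through a setter, which spreads the overdraft rule across every call site.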
You mention MVC too and say a model can't be responsible for its view; for that case Allen Holub gives an example of making an abstraction layer by having a "give-me-a-JComponent-that-represents-your-identity class" which he says would "isolate the way identities are represented from the rest of the system." I'm not experienced enough to comment on whether that would work or not but on the surface it sounds like a decent idea.
Public getters/setters are bad if they provide access to implementation details. Yet, it is reasonable to provide access to an object's properties and use getters/setters for this. For example, if Car has the color property, it's acceptable to let clients "observe" it using a getter. If some client needs the ability to recolor a car, the class can provide a setter ('recolor' is a clearer name though). It is important not to let clients know how properties are stored in objects, how they are maintained, and so on.
Ummmm...has he never heard of the concept of Encapsulation? Getter and Setter methods are put in place to control access to a Class' members. By making all fields publicly visible...anybody could write whatever values they wanted to them, thereby completely invalidating the entire object.
Just in case anybody is a little fuzzy on the concept of Encapsulation, read up on it here:
Encapsulation (Computer Science)
...and if they're really evil, would .NET build the Property concept into the language? (Getter and Setter methods that just look a little prettier)
EDIT
The article does mention Encapsulation:
"Getters and setters can be useful for variables that you specifically want to encapsulate, but you don't have to use them for all variables. In fact, using them for all variables is nasty code smell."
Using this method will lead to extremely hard to maintain code in the long run. If you find out halfway through a project that spans years that a field needs to be Encapsulated, you're going to have to update EVERY REFERENCE to that field everywhere in your software to get the benefit. Sounds a lot smarter to use proper Encapsulation up front and save yourself the headache later.
I think that getters and setters should only be used for variables which one needs to access or change outside a class. That being said, I don't believe variables should be public unless they're static. This is because making variables public which aren't static can lead to them being changed undesirably. Let's say you have a developer who is carelessly using public variables. He then accesses a variable from another class and without meaning to, changes it. Now he has an error in his software as a result of this mishap. That's why I believe in the proper use of getters and setters, but you don't need them for every private or protected variable.
One of the biggest advantages of object-oriented programming is encapsulation, and one of the "truths" we've (or, at least, I've) been taught is that members should always be made private and made available via accessor and mutator methods, thus ensuring the ability to verify and validate the changes.
I'm curious, though, how important this really is in practice. In particular, if you've got a more complicated member (such as a collection), it can be very tempting to just make it public rather than make a bunch of methods to get the collection's keys, add/remove items from the collection, etc.
Do you follow the rule in general? Does your answer change depending on whether it's code written for yourself vs. to be used by others? Are there more subtle reasons I'm missing for this obfuscation?
It depends. This is one of those issues that must be decided pragmatically.
Suppose I had a class for representing a point. I could have getters and setters for the X and Y coordinates, or I could just make them both public and allow free read/write access to the data. In my opinion, this is OK because the class is acting like a glorified struct - a data collection with maybe some useful functions attached.
However, there are plenty of circumstances where you do not want to provide full access to your internal data and rely on the methods provided by the class to interact with the object. An example would be an HTTP request and response. In this case it's a bad idea to allow anybody to send anything over the wire - it must be processed and formatted by the class methods. In this case, the class is conceived of as an actual object and not a simple data store.
It really comes down to whether or not verbs (methods) drive the structure or if the data does.
As someone having to maintain several-year-old code worked on by many people in the past, it's very clear to me that if a member attribute is made public, it is eventually abused. I've even heard people disagreeing with the idea of accessors and mutators, as that's still not really living up to the purpose of encapsulation, which is "hiding the inner workings of a class". It's obviously a controversial topic, but my opinion would be "make every member variable private, think primarily about what the class has got to do (methods) rather than how you're going to let people change internal variables".
Yes, encapsulation matters. Exposing the underlying implementation does (at least) two things wrong:
Mixes up responsibilities. Callers shouldn't need or want to understand the underlying implementation. They should just want the class to do its job. By exposing the underlying implementation, your class isn't doing its job. Instead, it's just pushing the responsibility onto the caller.
Ties you to the underlying implementation. Once you expose the underlying implementation, you're tied to it. If you tell callers, e.g., there's a collection underneath, you cannot easily swap the collection for a new implementation.
These (and other) problems apply regardless of whether you give direct access to the underlying implementation or just duplicate all the underlying methods. You should be exposing the necessary implementation, and nothing more. Keeping the implementation private makes the overall system more maintainable.
I prefer to keep members private as long as possible and only access them via getters, even from within the very same class. I also try to avoid setters in a first draft, to promote value-style objects for as long as possible. Working with dependency injection a lot, you often have setters but no getters, as clients should be able to configure the object but others should not get to know what's actually configured, as this is an implementation detail.
Regards,
Ollie
I tend to follow the rule pretty strictly, even when it's just my own code. I really like Properties in C# for that reason. It makes it really easy to control what values it's given, but you can still use them as variables. Or make the set private and the get public, etc.
Basically, information hiding is about code clarity. It's designed to make it easier for someone else to extend your code, and prevent them from accidentally creating bugs when they work with the internal data of your classes. It's based on the principle that nobody ever reads comments, especially ones with instructions in them.
Example: I'm writing code that updates a variable, and I need to make absolutely sure that the GUI changes to reflect the change. The easiest way is to add an accessor method (aka a "Setter"), which is called instead of updating the data directly.
If I make that data public, and something changes the variable without going through the Setter method (and this happens every swear-word time), then someone will need to spend an hour debugging to find out why the updates aren't being displayed. The same applies, to a lesser extent, to "Getting" data. I could put a comment in the header file, but odds are that no-one will read it till something goes terribly, terribly wrong. Enforcing it with private means that the mistake can't be made, because it'll show up as an easily located compile-time bug, rather than a run-time bug.
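A hedged sketch of the setter-with-notification idea in Java. The `ObservableValue` class and the use of `IntConsumer` as a stand-in for "the GUI" are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// The field is private, so every change must pass through setValue().
class ObservableValue {
    private int value;
    private final List<IntConsumer> listeners = new ArrayList<>();

    // Register something to be told about changes (e.g. a GUI refresh).
    public void addListener(IntConsumer listener) {
        listeners.add(listener);
    }

    public void setValue(int newValue) {
        value = newValue;
        // The side effect the caller must not be allowed to forget.
        for (IntConsumer listener : listeners) {
            listener.accept(newValue);
        }
    }

    public int getValue() {
        return value;
    }
}
```

An attempt to write to `value` from outside the class fails at compile time, which is the easily located error the answer describes.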
From experience, the only times you'd want to make a member variable public, and leave out Getter and Setter methods, is if you want to make it absolutely clear that changing it will have no side effects; especially if the data structure is simple, like a class that simply holds two variables as a pair.
This should be a fairly rare occurrence, as normally you'd want side effects, and if the data structure you're creating is so simple that you don't (e.g. a pairing), there will already be a more efficiently written one available in a Standard Library.
With that said, for most small programs that are one-use no-extension, like the ones you get at university, it's more "good practice" than anything, because you'll remember over the course of writing them, and then you'll hand them in and never touch the code again. Also, if you're writing a data structure as a way of finding out about how they store data rather than as release code, then there's a good argument that Getters and Setters will not help, and will get in the way of the learning experience.
It's only when you get to the workplace or a large project, where the probability is that your code will be called by objects and structures written by different people, that it becomes vital to make these "reminders" strong. Whether or not it's a single-person project is surprisingly irrelevant, for the simple reason that "you six weeks from now" is as different a person as a co-worker. And "me six weeks ago" often turns out to be lazy.
A final point is that some people are pretty zealous about information hiding, and will get annoyed if your data is unnecessarily public. It's best to humour them.
C# Properties 'simulate' public fields. Looks pretty cool and the syntax really speeds up creating those get/set methods
Keep in mind the semantics of invoking methods on an object. A method invocation is a very high level abstraction that can be implemented by the compiler or the run time system in a variety of different ways.
If the object whose method you are invoking exists in the same process/memory map, then the method call could well be optimized by a compiler or VM into a direct access to the data member. On the other hand, if the object lives on another node in a distributed system, then there is no way that you can directly access its internal data members, but you can still invoke its methods by sending it a message.
By coding to interfaces you can write code that doesn't care where the target object exists, how its methods are invoked, or even whether it's written in the same language.
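As a rough Java sketch of that idea: the caller below depends only on a hypothetical `Counter` interface. `LocalCounter` lives in the same process, while `MessagingCounter` is merely a stand-in for a proxy to an object on another node (no real networking is done; the "message" is simulated by a counter). All names here are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: callers depend only on this contract.
interface Counter {
    void increment(String key);
    int get(String key);
}

// Same-process implementation; a VM is free to inline these calls.
class LocalCounter implements Counter {
    private final Map<String, Integer> counts = new HashMap<>();

    public void increment(String key) {
        counts.merge(key, 1, Integer::sum);
    }

    public int get(String key) {
        return counts.getOrDefault(key, 0);
    }
}

// Stand-in for a remote proxy: every call becomes a "message" to the target.
class MessagingCounter implements Counter {
    private final Counter target;   // pretend this object lives on another node
    private int messagesSent;

    MessagingCounter(Counter target) {
        this.target = target;
    }

    public void increment(String key) {
        messagesSent++;             // stand-in for serializing and sending a request
        target.increment(key);
    }

    public int get(String key) {
        messagesSent++;
        return target.get(key);
    }

    int getMessagesSent() {
        return messagesSent;
    }
}

class CounterClient {
    // Works with any Counter, wherever it lives and however it is invoked.
    static int countTwice(Counter counter, String key) {
        counter.increment(key);
        counter.increment(key);
        return counter.get(key);
    }
}
```

`CounterClient.countTwice` never touches a data member, so swapping one implementation for the other requires no change to the caller.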
In your example of an object that implements all the methods of a collection, then surely that object actually is a collection. So maybe this would be a case where inheritance would be better than encapsulation.
It's all about controlling what people can do with what you give them. The more controlling you are the more assumptions you can make.
Also, theoretically you can change the underlying implementation or something, but since for the most part it's:
private Foo foo;
public Foo getFoo() { return foo; }
public void setFoo(Foo foo) { this.foo = foo; }
It's a little hard to justify.
Encapsulation is important when at least one of these holds:
Anyone but you is going to use your class (or they'll break your invariants because they don't read the documentation).
Anyone who doesn't read the documentation is going to use your class (or they'll break your carefully documented invariants). Note that this category includes you-two-years-from-now.
At some point in the future someone is going to inherit from your class (because maybe an extra action needs to be taken when the value of a field changes, so there has to be a setter).
If it is just for me, and used in few places, and I'm not going to inherit from it, and changing fields will not invalidate any invariants that the class assumes, only then will I occasionally make a field public.
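A small Java sketch of the inheritance case from the list above, under the assumption of a hypothetical `Customer` class with a non-empty-name invariant. Because the field is private and writes go through `setName`, a later subclass can hook an extra action onto every change.

```java
// Hypothetical base class: the field is private, so every write goes through setName().
class Customer {
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        if (name == null || name.isEmpty()) {
            // The invariant callers would break if the field were public.
            throw new IllegalArgumentException("name must not be empty");
        }
        this.name = name;
    }
}

// A subclass written later adds an action on change without touching any caller.
class AuditedCustomer extends Customer {
    private int changeCount;

    @Override
    public void setName(String name) {
        super.setName(name);   // keep the base-class invariant check
        changeCount++;         // extra action, reached only after a valid change
    }

    public int getChangeCount() {
        return changeCount;
    }
}
```

Had `name` been a public field, neither the validation nor the subclass hook could be enforced.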
My tendency is to try to make everything private if possible. This keeps object boundaries as clearly defined as possible and keeps the objects as decoupled as possible. I like this because when I have to rewrite an object that I botched the first (second, fifth?) time, it keeps the damage contained to a smaller number of objects.
If you couple the objects tightly enough, it may be more straightforward just to combine them into one object. If you relax the coupling constraints enough you're back to structured programming.
It may be that if you find that a bunch of your objects are just accessor functions, you should rethink your object divisions. If you're not doing any actions on that data it may belong as a part of another object.
Of course, if you're writing something like a library, you want as clear and sharp an interface as possible so others can program against it.
Fit the tool to the job... recently I saw some code like this in my current codebase:
private static class SomeSmallDataStructure {
    public int someField;
    public String someOtherField;
}
And then this class was used internally for easily passing around multiple data values. It doesn't always make sense, but if you have just DATA, with no methods, and you aren't exposing it to clients, I find it a quite useful pattern.
The most recent use I had of this was a JSP page where I had a table of data being displayed, defined at the top declaratively. So, initially it was in multiple arrays, one array per data field... this ended in the code being rather difficult to wade through, with fields that would be displayed together not being next to each other in the definition... so I created a simple class like above which would pull it together... the result was REALLY readable code, a lot more so than before.
Moral... sometimes you should consider "accepted bad" alternatives if they may make the code simpler and easier to read, as long as you think it through and consider the consequences... don't blindly accept EVERYTHING you hear.
That said... public getters and setters are pretty much equivalent to public fields... at least essentially (there is a tad more flexibility, but it is still a bad pattern to apply to EVERY field you have).
Even the Java standard libraries have some cases of public fields.
When I make objects meaningful they are easier to use and easier to maintain.
For example: Person.Hand.Grab(howquick, howmuch);
The trick is not to think of members as simple values but objects in themselves.
I would argue that this question mixes up the concept of encapsulation with 'information hiding'
(this is not a criticism, since it does seem to match a common interpretation of the notion of 'encapsulation').
However for me, 'encapsulation' is either:
the process of regrouping several items into a container
the container itself regrouping the items
Suppose you are designing a tax payer system. For each tax payer, you could encapsulate the notion of child into
a list of children representing the children
a map, to take into account children from different parents
an object Children (not Child) which would provide the needed information (like total number of children)
Here you have three different kinds of encapsulations, 2 represented by low-level container (list or map), one represented by an object.
By making those decisions, you do not
make that encapsulation public or protected or private: that choice of 'information hiding' is still to be made
make a complete abstraction (you need to refine the attributes of object Children and you may decide to create an object Child, which would keep only the relevant information from the point of view of a tax payer system)
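The three kinds of encapsulation from the tax payer example might be sketched in Java as follows. This is only an illustration: the class and field names are hypothetical, and keying the map by the other parent's name is an assumption, since the original does not say what the map's key would be. Visibility modifiers are deliberately left at their defaults, because the 'information hiding' choice is a separate decision.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tax payer holding the same notion of "child" in three ways.
class TaxPayerSketch {
    // 1. Low-level container: a plain list of children's names.
    List<String> childList = new ArrayList<>();

    // 2. Low-level container: a map (assumed here to be keyed by the
    //    other parent's name) grouping children from different parents.
    Map<String, List<String>> childrenByOtherParent = new HashMap<>();

    // 3. A dedicated object that can answer questions about the group.
    Children children = new Children();
}

// An object 'Children' (not 'Child') that provides aggregate information.
class Children {
    private final List<String> names = new ArrayList<>();

    void add(String name) {
        names.add(name);
    }

    int total() {
        return names.size();
    }
}
```

All three group the items into a container; none of them yet says who is allowed to see that container.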
Abstraction is the process of choosing which attributes of the object are relevant to your system, and which must be completely ignored.
So my point is:
That question might have been titled:
Private vs. Public members in practice (how important is information hiding?)
Just my 2 cents, though. I perfectly respect that one may consider encapsulation as a process including 'information hiding' decision.
However, I always try to differentiate 'abstraction' - 'encapsulation' - 'information hiding or visibility'.
@VonC
You might find the International Organisation for Standardization's, "Reference Model of Open Distributed Processing," an interesting read. It defines: "Encapsulation: the property that the information contained in an object is accessible only through interactions at the interfaces supported by the object."
I tried to make a case for information hiding's being a critical part of this definition here:
http://www.edmundkirwan.com/encap/s2.html
Regards,
Ed.
I find lots of getters and setters to be a code smell that the structure of the program is not designed well. You should look at the code that uses those getters and setters, and look for functionality that really should be part of the class. In most cases, the fields of a class should be private implementation details and only the methods of that class may manipulate them.
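To illustrate with a hedged Java sketch: `BareAccount` and `Account` below are hypothetical classes, not taken from any real project. In the first, callers must implement the withdrawal rule themselves through a getter and a setter; in the second, the functionality has moved into the class and the field stays a private implementation detail.

```java
// Hypothetical "before": the class is just accessors, so every caller
// has to re-implement the withdrawal rule around get/set.
class BareAccount {
    private long balanceCents;

    long getBalanceCents() {
        return balanceCents;
    }

    void setBalanceCents(long balanceCents) {
        this.balanceCents = balanceCents;
    }
}

// Hypothetical "after": the rule lives with the data it protects.
class Account {
    private long balanceCents;

    Account(long openingBalanceCents) {
        this.balanceCents = openingBalanceCents;
    }

    void withdraw(long amountCents) {
        if (amountCents <= 0 || amountCents > balanceCents) {
            throw new IllegalArgumentException("invalid withdrawal");
        }
        balanceCents -= amountCents;
    }

    long getBalanceCents() {   // read access only; there is no setter
        return balanceCents;
    }
}
```

With `Account`, no caller can produce a negative balance, whereas with `BareAccount` that depends on every caller remembering the check.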
Having both getters and setters is equal to the field being public (when the getters and setters are trivial/generated automatically). Sometimes it might be better to just declare the fields public, so that the code will be more simple, unless you need polymorphism or a framework requires get/set methods (and you can't change the framework).
But there are also cases where having getters and setters is a good pattern. One example:
When I create the GUI of an application, I try to keep the behaviour of the GUI in one class (FooModel) so that it can be unit tested easily, and have the visualization of the GUI in another class (FooView) which can be tested only manually. The view and model are joined with simple glue code; when the user changes the value of field x, the view calls setX(String) on the model, which in turn may raise an event that some other part of the model has changed, and the view will get the updated values from the model with getters.
In one project, there is a GUI model which has 15 getters and setters, of which only 3 get methods are trivial (such that the IDE could generate them). All the others contain some functionality or non-trivial expressions, such as the following:
public boolean isEmployeeStatusEnabled() {
    return pinCodeValidation.equals(PinCodeValidation.VALID);
}
public EmployeeStatus getEmployeeStatus() {
    Employee employee;
    if (isEmployeeStatusEnabled()
            && (employee = getSelectedEmployee()) != null) {
        return employee.getStatus();
    }
    return null;
}
public void setEmployeeStatus(EmployeeStatus status) {
    getSelectedEmployee().changeStatusTo(status, getPinCode());
    fireComponentStateChanged();
}
In practice I always follow only one rule, the "no size fits all" rule.
Encapsulation and its importance are a product of your project. Which objects will be accessing your interface? How will they be using it? Will it matter if they have unneeded access rights to members? Those are the kinds of questions you need to ask yourself when working on each project's implementation.
I base my decision on the Code's depth within a module.
If I'm writing code that is internal to a module and does not interface with the outside world, I don't encapsulate things with private as much, because it affects my programmer performance (how fast I can write and rewrite my code).
But for the objects that serve as the module's interface with user code, I adhere to strict privacy patterns.
Certainly it makes a difference whether you're writing internal code or code to be used by someone else (or even by yourself, but as a contained unit). Any code that is going to be used externally should have a well defined/documented interface that you'll want to change as little as possible.
For internal code, depending on the difficulty, you may find it's less work to do things the simple way now, and pay a little penalty later. Of course Murphy's law will ensure that the short term gain will be erased many times over in having to make wide-ranging changes later on where you needed to change a class' internals that you failed to encapsulate.
Specifically to your example of using a collection that you would return, it seems possible that the implementation of such a collection might change (unlike simpler member variables) making the utility of encapsulation higher.
That being said, I kinda like Python's way of dealing with it. Member variables are public by default. If you want to hide them or add validation there are techniques provided, but those are considered the special cases.
I follow the rules on this almost all the time. There are four scenarios for me - basically, the rule itself and several exceptions (all Java-influenced):
Usable by anything outside of the current class, accessed via getters/setters
Internal-to-class usage typically preceded by 'this' to make it clear that it's not a method parameter
Something meant to stay extremely small, like a transport object - basically a straight shot of attributes; all public
Needed to be non-private for extension of some sort
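The four scenarios could look roughly like this in Java. Every name below (`Shape`, `ScaledShape`, `PointDto`) is hypothetical and chosen only to show one field per scenario.

```java
// Hypothetical classes showing one field per scenario.
class Shape {
    // Scenario 1: usable from outside the class, so private with getter/setter.
    private String label;

    // Scenario 4: needs to be non-private so that subclasses can extend behavior.
    protected double scale = 1.0;

    public String getLabel() {
        // Scenario 2: internal use is prefixed with 'this' to show it is a field.
        return this.label;
    }

    public void setLabel(String label) {
        this.label = label;   // 'this.label' is the field, 'label' the parameter
    }
}

class ScaledShape extends Shape {
    // Uses the protected field directly, which scenario 4 allows.
    double scaled(double size) {
        return size * this.scale;
    }
}

// Scenario 3: a tiny transport object, a straight shot of attributes, all public.
class PointDto {
    public int x;
    public int y;
}
```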
There's a practical concern here that isn't being addressed by most of the existing answers. Encapsulation and the exposure of clean, safe interfaces to outside code is always great, but it's much more important when the code you're writing is intended to be consumed by a spatially- and/or temporally-large "user" base. What I mean is that if you plan on somebody (even you) maintaining the code well into the future, or if you're writing a module that will interface with code from more than a handful of other developers, you need to think much more carefully than if you're writing code that's either one-off or wholly written by you.
Honestly, I know what wretched software engineering practice this is, but I'll oftentimes make everything public at first, which makes things marginally faster to remember and type, then add encapsulation as it makes sense. Refactoring tools in most popular IDEs these days make which approach you use (adding encapsulation vs. taking it away) much less relevant than it used to be.