Application design - When should interfaces be used? - language-agnostic

I kind of understand an interface as being a contract that can be applied to classes that would otherwise have nothing in common (ex: Comparable in Java). However, in what situation(s) would you have the reflex of adding an interface at the design stage?

Whenever you are using a statically typed language and you want other developers to be able to use your code while supplying an alternate implementation. In such a language, an interface is what makes low(er) coupling possible.
Languages that rely on duck typing rather than strict type checking, Python for example, generally have no need for interfaces.

"I kind of understand an interface as being a contract that can be applied to classes that would otherwise have nothing in common" - that's probably not the way to think about what an Interface is.
An Interface describes behaviour, and implementing an interface means a class enters into a contract to deliver that behavior.
By programming to an interface, rather than an implementation, you enable polymorphism and get more flexible code with lower coupling. For example, this method can take any instance that implements IQuack:
public void DoSomething(IQuack quacker)
{
// ...
}
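As a minimal sketch of the same idea in Java (the quack() method and the Duck and RubberDuck classes are made up for illustration, they are not part of the answer above), two unrelated classes can be passed to one method that knows only the interface:

```java
// Hypothetical contract: anything that can quack.
interface IQuack {
    String quack();
}

// Two otherwise unrelated classes that both honor the contract.
class Duck implements IQuack {
    @Override
    public String quack() { return "Quack"; }
}

class RubberDuck implements IQuack {
    @Override
    public String quack() { return "Squeak"; }
}

class QuackDemo {
    // Depends only on the IQuack contract, never on a concrete class.
    static String doSomething(IQuack quacker) {
        return "It says: " + quacker.quack();
    }

    public static void main(String[] args) {
        System.out.println(doSomething(new Duck()));
        System.out.println(doSomething(new RubberDuck()));
    }
}
```

Adding a third implementation later would not require touching doSomething at all, which is the lower coupling the answer refers to.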

If you are designing a product and you know it is going to interact with some type of device, service, etc., but not necessarily which one, you can use an interface to move forward with the overall architecture, PROVIDED that you know enough about those types of devices to write an interface that any given device of that type can successfully use. Of course, if you are in the design phase, you had better have that knowledge. It's not uncommon to do high-level designs using only interface declarations. I'm not saying it's good or bad, but it seems to be a pretty common practice among those who use software (like Rose, etc.) to generate a skeleton from UML.
Another time would be if you know exactly what device you are going to use but you think there might be a chance that you will need to work with different or multiple types of that device down the road.
A third usage of interfaces is to reduce duplicated code. This is probably the only place people ever get carried away with interface usage, and if it weren't for that, I'd be comfortable saying: don't ask "Should this be an interface?" but "Can this be an interface?".

Related

Why should you ever have to care whether an object reference is an interface or a class?

I often seem to run into the discussion of whether or not to apply some sort of prefix/suffix convention to interface type names, typically adding "I" to the beginning of the name.
Personally I'm in the camp that advocates no prefix, but that's not what this question is about. Rather, it's about one of the arguments I often hear in that discussion:
"You can no longer see at-a-glance whether something is an interface or a class."
The question that immediately pops up in my head is: apart from object creation, why should you ever have to care whether an object reference is a class or an interface?
I've tagged this question as language agnostic, but as has been pointed out it may not be. I contend that it is because while specific language implementation details may be interesting, I'd like to keep this on a conceptual level. In other words, I think that, conceptually, you'd never have to care whether an object reference is typed as a class or an interface but I'm not sure, hence the question.
This is not a discussion about IDEs and what they do or don't do when visualizing the different types; caring about the type of an object is certainly a necessity when browsing through code (packages/sources/whatever form). Nor is it a discussion about the pros or cons of either naming convention. I just can't seem to figure out in what scenario, other than object creation, you actually care about whether or not you're referencing a concrete type or an interface.
Most of the time, you probably don't care. But here are some instances that I can think of where you would. There are several, and it does vary a little bit by language. Some languages don't mind as much as others.
In the case of inversion of control (where someone PASSES you a parameter) you probably don't care if it's an interface or an object as far as calling its methods etc. But when dealing with types, it definitely can make a difference.
In managed languages such as the .NET languages, a class can inherit from only one class but implement many interfaces, whereas an interface can inherit only from other interfaces. The order of classes vs interfaces may also matter in a class or interface declaration. So you need to know which is which when defining a new class or interface.
In Delphi / VCL, interfaces are reference counted and automatically collected, whereas classes must be explicitly freed, so lifecycle management on the whole is affected, not just the creation.
Interfaces may not be viable sources for class references.
Interfaces can be cast to compatible interfaces, but in many languages, they cannot be cast to compatible classes. Classes can be cast to either.
Interfaces may be passed to parameters of type IID, or IUnknown, whereas classes cannot (without a cast and a supporting interface).
An interface's implementation is unknown. Its input and output are defined, but the implementation which creates the output is abstracted. In general, one's attitude may be that when working with a class, one may know how the class works. But when working with an interface, no such assumption should be made. In a perfect world, it might make no difference. But in reality, this most certainly can affect your design.
I agree with you (and thereby do not use an "I" prefix for interfaces). We shouldn't have to care whether it is an abstract class or an interface.
Worth noting that Java needs to have a notion of interface solely because it does not support multiple inheritance. Otherwise, "abstract class" concept would suffice (which may be "all" abstract, or partially abstract, or almost concrete and just 1 tiny bit abstract, whatever).
Things that a concrete class can have and an interface can't:
Constructors
Instance fields
Static methods (in Java before version 8) and non-constant static fields
So if you use the convention of starting all interface names with 'I' then it indicates to the user of your library that the particular type will not have any of the above mentioned things.
But personally I feel that this is not reason enough to start all interface names with 'I'. Modern IDEs are powerful enough to indicate if some type is an interface. Also, it hides the true meaning of an interface name: imagine if the Runnable and List interfaces were named IRunnable and IList respectively.
When a class is used, I can make the assumption that I will get objects from a relatively small and almost well-defined range of subclasses. That's because subclassing is - or at least it should be - a decision that isn't made too easily, especially in languages that don't support multiple inheritance. In contrast, interfaces can be implemented by any class, and the implementation can be added later to any class.
So the information is useful, especially when browsing through code, and trying to get a feeling what the code author intended to do - but I think it should be enough, if the IDE shows interfaces/classes as distinctive icons.
You want to see at a glance which are the "interfaces" and which are the "concrete classes" so that you can focus your attention to the abstractions in the design instead of the details.
Good designs are based on abstractions - if you know and understand them you understand the system without knowing any of the details. So you know you can skip the classes without the I prefix, and focus on the ones that do have it while you are understanding the code, and you also know to avoid building new code around non-interface classes without having to refer to some other design document.
I agree that the I* naming convention is just not appropriate for modern OO languages, but the truth is this question isn't really language agnostic. There are legitimate cases where you have an interface not for any architectural reason but because you simply don't have an implementation, or don't have access to one. For these cases you can read I* as *Stub or similar, and, in these cases, it might make sense to have an IBlah and a Blah class.
These days, though, you rarely come up against this, and in modern OO languages when you say Interface you actually mean Interface not just I don't have the code for this. So there is no need for the I*, and in fact it encourages really bad OO design as you won't get the natural naming conflicts that would tell you something's gone wrong in your architecture. Say you had a List and an IList... what's the difference? when would you use one over the other? if you wanted to implement IList would you be constrained (conceptually at least) by what List does? I'll tell you what... if I found both an IBlah and a Blah class in any of my codebases I would purge one at random and take away that person's commit privileges.
Interfaces don't have fields, hence when you use IDisposable (or whatever), you know you're only declaring what you can do. That seems to me the main point of it.
Distinguishing between interfaces and classes may be useful, anywhere the type is referenced, in the IDE or out, to determine:
Can I make a new implementation of this type?
Can I implement this interface in a language that does not support multiple inheritance of implementation classes (e.g., Java)?
Can there be multiple implementations of this type?
Can I easily mock this interface in an arbitrary mocking framework?
It is worth noting that UML distinguishes between interfaces and implementation classes. In addition, the "I" prefix is used in the examples in "The Unified Modeling Language User Guide" by the three amigos Booch, Jacobson and Rumbaugh. (Incidentally, this also provides an example why IDE syntax coloring alone is not sufficient to distinguish in all contexts.)
You should care, because :
An interface with a capital "I" enables one, namely you or your co-workers, to use any implementation which implements the interface. Without it, if in the future you figure out a better way to do something, say a better list sorting algorithm, you will be stuck with having to change ALL of the invoking methods as well.
It helps in understanding code - e.g. you don't need to memorize all 10 implementations of say, I_SortableList , you just care that it sorts a list (or something like that). Your code becomes practically self-documenting here.
To complete the discussion, here is a pseudocode example illustrating the above:
//Pseudocode - define implementations of ISortableList
Class SortList1 : ISortableList, SortList2 : ISortableList, SortList3 : ISortableList
//PseudoCode - the interface way
void Populate(ISortableList list, int[] nums)
{
list.set(nums)
}
//PseudoCode - the "i dont care way"
void Populate2( SortList1 list, int[] nums )
{
list.set(nums)
}
...
//Pseudocode - create instances
SortList1 list1 = new SortList1();
SortList2 list2 = new SortList2();
SortList3 list3 = new SortList3();
//Invoke Populate() - The "interface way"
Populate(list1,nums);//OK, list1 is ISortableList implementation
Populate(list2,nums);//OK, list2 is ISortableList implementation
Populate(list3,nums);//OK, list3 is ISortableList implementation
//Invoke Populate2() - the "I don't care way"
Populate2(list1,nums);//OK, list1 is an instance of SortList1
Populate2(list2,nums);//Not OK, list2 is not of required argument type, won't compile
Populate2(list3,nums);//the same as above
Hope this helps,
Jas.

Interfaces vs Public Class Members

I've noticed that some programmers like to make interfaces for just about all their classes. I like interfaces for certain things (such as checking if an object supports a certain behavior and then having an interface for that behavior) but overuse of interfaces can sometimes bloat the code. When I declare methods or properties as public I'd expect people to just use my concrete classes and I don't really understand the need to create interfaces on top of that.
I'd like to hear your take on interfaces. When do you use them and for what purposes?
Thank you.
Applying any kind of design pattern or idea without thinking, just because somebody told you it's good practice, is a bad idea.
That of course includes creating a separate interface for each and every class you create. You should at least be able to give a good reason for every design decision, and "because Joe says it's good practice" is not a good enough reason.
Interfaces are good for decoupling the interface of some unit of code from its implementation. A reason to create an interface is because you foresee that there might be multiple implementations of it in the future. It can also help with unit testing; you can make a mock implementation of the services that the unit you want to test depends on, and plug the mock implementations in instead of "the real thing" for testing.
Interfaces are a powerful tool for abstraction. With them, you can more freely substitute (for example) test classes and thereby decouple your code. They are also a way to narrow the scope of your code; you probably don't need the full feature set of a given class in a particular place - exactly what features do you need? That's a client-focused way of thinking about interfaces.
Unit tests.
With an interface describing all class methods and properties it is within the reach of a click to create a mock-up class to simulate behavior that is not within the scope of said test.
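To make the mocking argument concrete, here is a small Java sketch. Every name in it (RateService, PriceConverter, FixedRateService) is invented for the example, and the mock is written by hand rather than generated by a framework:

```java
// Hypothetical service the unit under test depends on.
interface RateService {
    double rateFor(String currency);
}

// Unit under test: knows only the interface.
class PriceConverter {
    private final RateService rates;

    PriceConverter(RateService rates) { this.rates = rates; }

    double convert(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

// Hand-written mock: returns a canned value and records the call.
class FixedRateService implements RateService {
    String lastCurrency;

    @Override
    public double rateFor(String currency) {
        lastCurrency = currency;
        return 2.0;
    }
}

class MockDemo {
    public static void main(String[] args) {
        FixedRateService mock = new FixedRateService();
        PriceConverter converter = new PriceConverter(mock);
        System.out.println(converter.convert(10.0, "EUR"));
        System.out.println(mock.lastCurrency);
    }
}
```

The real RateService might call a remote system; the test never needs it, because PriceConverter cannot tell the difference.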
It's all about expecting and preparing for change.
One approach that some use (and I'm not necessarily advocating it)
is to create an IThing and a ThingFactory.
All code will reference IThing (instead of ConcreteThing).
All object creation can be done via the Factory Method.
ThingFactory.CreateThing(some params).
So, today we only have AmericanConcreteThing. And the possibility is that we may never need another. However, if experience has taught me anything, it is that we will ALWAYS need another.
You may not need EuropeanThing, but TexasAmericanThing is a distinct possibility.
So, in order to minimize the impact on my code, I can change the creational line to:
ThingFactory.CreateThing( Account )
and create my class TexasAmericanThing : IThing.
Other than building the class, the only change is to the ThingFactory, which will require a change from
public static IThing CreateThing(Account a)
{
return new AmericanThing();
}
to
public static IThing CreateThing(Account a)
{
if (a.State == State.TEXAS) return new TexasAmericanThing();
return new AmericanThing();
}
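For readers who want to run the idea, here is a self-contained Java rendering of the factory above. The describe() method, the shape of Account and the State.OTHER value are assumptions added only so the sketch compiles; the answer does not specify them:

```java
enum State { TEXAS, OTHER }   // OTHER is a placeholder value

class Account {
    final State state;
    Account(State state) { this.state = state; }
}

interface IThing {
    String describe();   // hypothetical method, only for the demo
}

class AmericanThing implements IThing {
    @Override
    public String describe() { return "American"; }
}

class TexasAmericanThing implements IThing {
    @Override
    public String describe() { return "Texas American"; }
}

class ThingFactory {
    // The only place that names concrete classes.
    static IThing createThing(Account a) {
        if (a.state == State.TEXAS) {
            return new TexasAmericanThing();
        }
        return new AmericanThing();
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        IThing thing = ThingFactory.createThing(new Account(State.TEXAS));
        System.out.println(thing.describe());
    }
}
```

Client code such as FactoryDemo holds an IThing and never changes when a new variant is introduced; only the factory does.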
I've seen plenty of mindless Interfaces myself. However, when used intelligently, they can save the day. You should use Interfaces for decoupling two components or two layers of an application. This can enable you to easily plug-in varying implementations of the interface without affecting the client, or simply insulate the client from constant changes to the implementation, as long as you stay true to the contract of the interface. This can make the code more maintainable in the long term and can save the effort of refactoring later.
However, overly aggressive decoupling can make for non-intuitive code. Its overuse can become a nuisance. You should carefully identify the cohesive parts of your application and the boundaries between them, and use interfaces there. Another benefit of using interfaces between such parts is that they can be developed in parallel and tested independently, using mock implementations of the interfaces they use.
OTOH, having client code access public member methods directly is perfectly okay if you really don't foresee any changes to the class that might also necessitate changes in the client. In any case, however, having public member fields I think is not good. This is extremely tight coupling! You are basically exposing the architecture of your class and making the client code dependent on it. Tomorrow if you realize that another data structure for a particular field will perform better, you can't change it without also changing the client code.
I primarily use interfaces for IoC to enable unit testing.
On the one hand, this could be interpreted as premature generalization. On the other hand, using interfaces as a rule helps you write code that is more easily composable and hence testable. I think the latter wins out in many cases.
I like interfaces:
* to define a contract between parts/modules/subsystems or 3rd party systems
* when there are exchangeable states or algorithms (state/strategy)

Should we avoid to use Object as the input parameter/ output value of a method?

Take Java syntax as an example, though the question itself is language independent. If the following snippet takes an object MyAbstractEmailTemplate as input argument in the method setTemplate, the class MyGateway will then become tightly-coupled with the object MyAbstractEmailTemplate, which lessens the re-usability of the class MyGateway.
A compromise is to use dependency-injection to ease the instantiation of MyAbstractEmailTemplate. This might solve the coupling problem to some extent, but the interface is still rigid, hardly providing enough flexibility to other developers/ applications.
So if we only use primitive data type (or even plain XML in web service) as the input/ output of a method, it seems the coupling problem no longer exists. So what do you think?
public class MyGateway {
protected MyAbstractEmailTemplate template;
public void setTemplate(MyAbstractEmailTemplate template) {
this.template = template;
}
}
It's pretty difficult to understand what you are really asking, but going the route of typing everything to Object does not lead to loose coupling, because you can't do anything with the input without downcasting, which would break the Liskov Substitution Principle.
Taken to the extreme it leads you here:
public class MyClass
{
public object Invoke(object obj);
}
This is not loose coupling, it's just obscure and hard-to-maintain code.
The name MyAbstractEmailTemplate makes me believe that you are talking about an abstract class.
You should always program against interfaces, so instead of having MyGateway depend on MyAbstractEmailTemplate, it should depend on an EmailTemplate interface, where MyAbstractEmailTemplate implements EmailTemplate. Then, you can pass your custom implementations around as you want to, without further tight coupling.
Combine this with DI and you've got yourself a pretty decent solution.
Not exactly sure what you mean with "the interface is still rigid", but obviously you should design your interface in such a way that it provides the functionality you need.
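The suggestion to depend on an EmailTemplate interface could look roughly like this in Java. The render() and body() methods, the WelcomeTemplate class and the send() method are assumptions made up for the sketch; the question only shows setTemplate:

```java
// Contract the gateway actually needs; render() is a hypothetical method.
interface EmailTemplate {
    String render(String recipient);
}

// The abstract class from the question now implements the interface.
abstract class MyAbstractEmailTemplate implements EmailTemplate {
    protected abstract String body();

    @Override
    public String render(String recipient) {
        return "Dear " + recipient + ", " + body();
    }
}

class WelcomeTemplate extends MyAbstractEmailTemplate {
    @Override
    protected String body() { return "welcome aboard."; }
}

class MyGateway {
    private EmailTemplate template;

    // Setter injection: a DI container or plain code supplies the template.
    public void setTemplate(EmailTemplate template) {
        this.template = template;
    }

    public String send(String recipient) {
        return template.render(recipient);
    }
}

class GatewayDemo {
    public static void main(String[] args) {
        MyGateway gateway = new MyGateway();
        gateway.setTemplate(new WelcomeTemplate());
        System.out.println(gateway.send("Ann"));
        // A lambda also satisfies the single-method interface.
        gateway.setTemplate(r -> "Hi " + r);
        System.out.println(gateway.send("Bob"));
    }
}
```

MyGateway no longer mentions the abstract class, so other applications can plug in templates that do not inherit from it.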
MyGateway has to assume something about the inputs. Even if it used XML, it would have to assume something about the structure and content of the XML. Coupling isn't an evil in its own right; it expresses the contract between two pieces of code. The oft-repeated advice to avoid tight coupling is really just saying that coupling should express the essence of a contract, not more and not less. Passing a specific type (particularly an interface type) is a very good way to achieve this balance.
The first problem you will run into is that a lot of types are simply not representable by a primitive data type (It's a Java problem that there are primitive types at all.).
The coupling should be reduced by using a proper inheritance hierarchy. What does proper mean? The method should take as a parameter exactly that part of the interface that is needed. Not more, not less.
After all, you won't be able to avoid dependencies. Methods have to know what they can do with their input, or have to be able to make assumptions (see C++ concepts) about the capabilities of the input.
IMHO there is nothing inherently wrong in using objects (with a lowercase o, not Objects) as method parameters and/or class members. Yes, these create dependencies. You can manage this in (at least) two ways:
Acknowledge that by creating this dependency, the two classes become tightly coupled. This is entirely appropriate in many cases, where two (or more) classes in fact form a component, which is a meaningful unit of reuse in itself, and whose parts may not make much sense on their own or be interchangeable.
If there are multiple interchangeable candidates for a method parameter, these are obvious candidates to form a class hierarchy. Then you program to the interface and can pass any object of any class implementing that interface as a parameter to your method. Note that the phrase "there are multiple interchangeable candidates for a method parameter" is a loose rephrasing of the Liskov Substitution Principle, which is the foundation of polymorphism.
In some languages, e.g. C++, a third way would be using templates. Then you need no common interface; only the specific methods/members need to be resolvable when the template is instantiated. However, since instantiation happens at compile time, this is entirely static binding.
The problem, I would say, is that the best Java can offer is interfaces, and people are starting to see that they are too rigid. It would be interesting to use something like what the Go language has, which flexibly checks that all methods of an interface are present in the type, so you do not have to be explicit about implementing an interface. We also need something better than interfaces to specify the constraints, maybe some sort of contracts. Another issue is interface evolution.

Should I use an interface like IEnumerable, or a concrete class like List<>

I recently expressed my view about this elsewhere*, but I think it deserves further analysis so I'm posting this as its own question.
Let's say that I need to create and pass around a container in my program. I probably don't have a strong opinion about one kind of container versus another, at least at this stage, but I do pick one; for sake of argument, let's say I'm going to use a List<>.
The question is: Is it better to write my methods to accept and return a high level interface such as C#'s IEnumerable? Or should I write methods to take and pass the specific container class that I have chosen.
What factors and criteria should I look for to decide? What kinds of programs benefit from one or the other? Does the computer language affect your decision? Performance? Program size? Personal style?
(Does it even matter?)
*(Homework: find it. But please post your answer here before you look for my own, so as not to bias you.)
Your method should always accept the least-specific type it needs to execute its function. If your method needs to enumerate, accept IEnumerable. If it needs to do IList<>-specific things, by definition you must give it an IList<>.
The only thing that should affect your decision is how you plan to use the parameter. If you're only iterating over it, use IEnumerable<T>. If you are modifying the collection (e.g. list.Add(x)), use ICollection<T>; if you are also accessing indexed members (e.g. var x = list[3]), use IList<T>.
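The same rule can be sketched in Java, where Iterable<T> plays roughly the role of IEnumerable<T> and java.util.List<T> the role of IList<T>. This is an illustration under that assumed mapping, not a translation of any code from the answers; it needs Java 9 or later for List.of and Set.of:

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Set;

class LeastSpecific {
    // Only iterates, so Iterable is enough.
    static int total(Iterable<Integer> numbers) {
        int sum = 0;
        for (int n : numbers) {
            sum += n;
        }
        return sum;
    }

    // Needs positional access, so it asks for List.
    static <T> T middle(List<T> items) {
        return items.get(items.size() / 2);
    }

    public static void main(String[] args) {
        System.out.println(total(List.of(1, 2, 3)));
        System.out.println(total(Set.of(4, 5)));
        System.out.println(total(new ArrayDeque<Integer>(List.of(6))));
        System.out.println(middle(List.of("a", "b", "c")));
    }
}
```

total accepts a list, a set and a deque without any conversion, while middle states in its signature that it really does need indexing.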
There is always a tradeoff. The general rule of thumb is to declare things as high up the hierarchy as possible. So if all you need is access to the methods in IEnumerable then that is what you should use.
Another recent example from an SO question was a C API that took a filename instead of a FILE * (or file descriptor). There the filename severely limited what sorts of things could be passed in (there are many things you can pass in with a file descriptor, but only one that has a filename).
Once you have to start casting you have either gone too high OR you should be making a second method that takes a more specific type.
The only exception to this that I can think of is when speed is an absolute must and you do not want to go through the expense of a virtual method call. Declaring the specific type removes the overhead of virtual functions (will depend on the language/environment/implementation, but as a general statement that is likely correct).
It was a discussion with me that prompted this question, so Euro Micelli already knows my answer, but here it is! :)
I think Linq to Objects already provides a great answer to this question. By using the simplest interface to a sequence of items it could, it gives maximum flexibility about how you implement that sequence, which allows lazy generation, boosting productivity without sacrificing performance (not in any real sense).
It is true that premature abstraction can have a cost - but mainly it is the cost of discovering/inventing new abstractions. But if you already have perfectly good ones provided to you, then you'd be crazy not to take advantage of them, and that is what the generic collection interfaces provide you with.
There are those who will tell you that it is "easier" to make all the data in a class public, just in case you will need to access it. In the same way, Euro advised that it would be better to use a rich interface to a container such as IList<T> (or even the concrete class List<T>) and then clean up the mess later.
But I think, just as it is better to hide the data members of a class that you don't want to access, to allow you to modify the implementation of that class easily later, so you should use the simplest interface available to refer to a sequence of items. It is easier in practice to start by exposing something simple and basic and then "loosen" it later, than it is to start with something loose and struggle to impose order on it.
So assume IEnumerable<T> will do to represent a sequence. Then in those cases where you need to Add or Remove items (but still don't need by-index lookup), use ICollection<T>, which inherits IEnumerable<T> and so will be perfectly interoperable with your other code.
This way it will be perfectly clear (just from local examination of some code) precisely what that code will be able to do with the data.
Small programs require less abstraction, it is true. But if they are successful, they tend to become big programs. This is much easier if they employ simple abstractions in the first place.
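The point about lazy generation can be illustrated with a short Java sketch, using Iterable<Integer> as a stand-in for IEnumerable<T>. LazySquares and firstOver are hypothetical names invented here; the counter exists only to show how little work is done:

```java
import java.util.Iterator;

class LazySquares implements Iterable<Integer> {
    int produced = 0;   // counts how many values were actually generated

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int n = 1;

            @Override
            public boolean hasNext() { return true; }   // endless sequence

            @Override
            public Integer next() {
                produced++;
                int value = n * n;
                n++;
                return value;
            }
        };
    }
}

class LazyDemo {
    // Works with any Iterable, lazy or not, and stops as soon as it can.
    static int firstOver(Iterable<Integer> values, int threshold) {
        for (int v : values) {
            if (v > threshold) {
                return v;
            }
        }
        throw new IllegalArgumentException("no value over " + threshold);
    }

    public static void main(String[] args) {
        LazySquares squares = new LazySquares();
        System.out.println(firstOver(squares, 50));
        System.out.println(squares.produced);
    }
}
```

Had firstOver demanded a fully built list, the endless sequence could not have been passed to it at all.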
It does matter, but the correct solution depends entirely on usage. If you only need simple enumeration, then use IEnumerable; that way any implementer can be passed in and you still get the functionality you need. However, if you need list functionality and you don't want to create a new list instance every time the method is called with an enumerable that isn't already a list, then go with a list.
I answered a similar C# question here. I think you should always provide the simplest contract you can, which in the case of collections in my opinion, ordinarily is IEnumerable Of T.
The implementation can be provided by an internal BCL type - be it Set, Collection, List etcetera - whose required members are exposed by your type.
Your abstract type can always inherit simple BCL types, which are implemented by your concrete types. This in my opinion allows you to adhere to LSP easier.

Private vs. Public members in practice (how important is encapsulation?) [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
One of the biggest advantages of object-oriented programming is encapsulation, and one of the "truths" we've (or, at least, I've) been taught is that members should always be made private and made available via accessor and mutator methods, thus ensuring the ability to verify and validate the changes.
I'm curious, though, how important this really is in practice. In particular, if you've got a more complicated member (such as a collection), it can be very tempting to just make it public rather than make a bunch of methods to get the collection's keys, add/remove items from the collection, etc.
Do you follow the rule in general? Does your answer change depending on whether it's code written for yourself vs. to be used by others? Are there more subtle reasons I'm missing for this obfuscation?
It depends. This is one of those issues that must be decided pragmatically.
Suppose I had a class for representing a point. I could have getters and setters for the X and Y coordinates, or I could just make them both public and allow free read/write access to the data. In my opinion, this is OK because the class is acting like a glorified struct - a data collection with maybe some useful functions attached.
However, there are plenty of circumstances where you do not want to provide full access to your internal data and rely on the methods provided by the class to interact with the object. An example would be an HTTP request and response. In this case it's a bad idea to allow anybody to send anything over the wire - it must be processed and formatted by the class methods. In this case, the class is conceived of as an actual object and not a simple data store.
It really comes down to whether or not verbs (methods) drive the structure or if the data does.
As someone having to maintain several-year-old code worked on by many people in the past, it's very clear to me that if a member attribute is made public, it is eventually abused. I've even heard people disagreeing with the idea of accessors and mutators, as that's still not really living up to the purpose of encapsulation, which is "hiding the inner workings of a class". It's obviously a controversial topic, but my opinion would be "make every member variable private, think primarily about what the class has got to do (methods) rather than how you're going to let people change internal variables".
Yes, encapsulation matters. Exposing the underlying implementation does (at least) two things wrong:
Mixes up responsibilities. Callers shouldn't need or want to understand the underlying implementation. They should just want the class to do its job. By exposing the underlying implementation, your class isn't doing its job. Instead, it's just pushing the responsibility onto the caller.
Ties you to the underlying implementation. Once you expose the underlying implementation, you're tied to it. If you tell callers, e.g., there's a collection underneath, you cannot easily swap the collection for a new implementation.
These (and other) problems apply regardless of whether you give direct access to the underlying implementation or just duplicate all the underlying methods. You should be exposing the necessary implementation, and nothing more. Keeping the implementation private makes the overall system more maintainable.
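A minimal Java sketch of keeping a collection private while exposing only the operations callers need. Roster and its methods are invented for illustration and are not taken from the answer:

```java
import java.util.ArrayList;
import java.util.List;

class Roster {
    // Private detail: could become a LinkedHashSet or a database table later.
    private final List<String> names = new ArrayList<>();

    void enroll(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("name required");
        }
        if (!names.contains(name)) {
            names.add(name);
        }
    }

    boolean isEnrolled(String name) { return names.contains(name); }

    int size() { return names.size(); }
}

class RosterDemo {
    public static void main(String[] args) {
        Roster roster = new Roster();
        roster.enroll("Ada");
        roster.enroll("Ada");   // duplicate is ignored
        System.out.println(roster.size());
    }
}
```

Because no caller ever sees the List, swapping it for another structure changes Roster alone, and the validation in enroll cannot be bypassed.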
I prefer to keep members private as long as possible and only access them via getters, even from within the very same class. I also try to avoid setters in a first draft, to promote value-style objects for as long as possible. Working with dependency injection a lot, you often have setters but no getters, as clients should be able to configure the object but others should not get to know what's actually configured, since this is an implementation detail.
Regards,
Ollie
I tend to follow the rule pretty strictly, even when it's just my own code. I really like Properties in C# for that reason. It makes it really easy to control what values it's given, but you can still use them as variables. Or make the set private and the get public, etc.
Basically, information hiding is about code clarity. It's designed to make it easier for someone else to extend your code, and prevent them from accidentally creating bugs when they work with the internal data of your classes. It's based on the principle that nobody ever reads comments, especially ones with instructions in them.
Example: I'm writing code that updates a variable, and I need to make absolutely sure that the GUI changes to reflect the change. The easiest way is to add a mutator method (aka a "Setter"), which is called instead of updating the data directly.
If I make that data public, and something changes the variable without going through the Setter method (and this happens every swear-word time), then someone will need to spend an hour debugging to find out why the updates aren't being displayed. The same applies, to a lesser extent, to "Getting" data. I could put a comment in the header file, but odds are that no-one will read it till something goes terribly, terribly wrong. Enforcing it with private means that the mistake can't be made, because it'll show up as an easily located compile-time bug, rather than a run-time bug.
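Here is a rough sketch of that Setter idea in Java. The Temperature class and its listener list are hypothetical names, and the Runnable listeners merely stand in for whatever GUI refresh code you actually have:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model class; the Runnable listeners stand in for GUI refresh code.
class Temperature {
    private double celsius; // private: only setCelsius can change it
    private final List<Runnable> listeners = new ArrayList<>();

    void addChangeListener(Runnable listener) {
        listeners.add(listener);
    }

    double getCelsius() {
        return celsius;
    }

    void setCelsius(double value) {
        celsius = value;
        // Every update goes through here, so the refresh cannot be skipped.
        for (Runnable listener : listeners) {
            listener.run();
        }
    }
}
```

Because the field is private, code outside the class that tries to assign it directly simply doesn't compile, which is the compile-time error described above.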
From experience, the only time you'd want to make a member variable public, and leave out Getter and Setter methods, is when you want to make it absolutely clear that changing it will have no side effects; especially if the data structure is simple, like a class that simply holds two variables as a pair.
This should be a fairly rare occurrence, as normally you'd want side effects, and if the data structure you're creating is so simple that you don't (e.g. a pairing), there will already be a more efficiently written one available in a Standard Library.
With that said, for most small programs that are one-use no-extension, like the ones you get at university, it's more "good practice" than anything, because you'll remember over the course of writing them, and then you'll hand them in and never touch the code again. Also, if you're writing a data structure as a way of finding out about how they store data rather than as release code, then there's a good argument that Getters and Setters will not help, and will get in the way of the learning experience.
It's only when you get to the workplace or a large project, where the probability is that your code will be called by objects and structures written by different people, that it becomes vital to make these "reminders" strong. Whether or not it's a single-man project is surprisingly irrelevant, for the simple reason that "you six weeks from now" is as different a person as a co-worker. And "me six weeks ago" often turns out to be lazy.
A final point is that some people are pretty zealous about information hiding, and will get annoyed if your data is unnecessarily public. It's best to humour them.
C# Properties 'simulate' public fields. They look pretty cool, and the syntax really speeds up creating those get/set methods.
Keep in mind the semantics of invoking methods on an object. A method invocation is a very high level abstraction that can be implemented by the compiler or the run time system in a variety of different ways.
If the object whose method you are invoking exists in the same process/memory map, then a method could well be optimized by a compiler or VM to directly access the data member. On the other hand, if the object lives on another node in a distributed system, then there is no way that you can directly access its internal data members, but you can still invoke its methods by sending it a message.
By coding to interfaces you can write code that doesn't care where the target object exists or how its methods are invoked or even if it's written in the same language.
In your example of an object that implements all the methods of a collection, surely that object actually is a collection, so maybe this is a case where inheritance would be better than encapsulation.
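To make that concrete, here is a small sketch with made-up names (Greeter, LocalGreeter, MessagingGreeter, Welcome). The "transport" is just a Function standing in for a real network call, which is purely an assumption for illustration:

```java
import java.util.function.Function;

interface Greeter {
    String greet(String name);
}

// Same-process implementation: a plain method call.
class LocalGreeter implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Stand-in for a remote object: the call becomes a message handed to a transport.
// The transport here is only a Function; a real one would go over the network.
class MessagingGreeter implements Greeter {
    private final Function<String, String> transport;

    MessagingGreeter(Function<String, String> transport) {
        this.transport = transport;
    }

    public String greet(String name) {
        return transport.apply("GREET " + name);
    }
}

class Welcome {
    // Depends only on the interface, so it cannot tell the two apart.
    static String welcome(Greeter greeter, String name) {
        return greeter.greet(name) + "!";
    }
}
```

Welcome.welcome works unchanged with either implementation, and neither one exposes any data member to it.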
It's all about controlling what people can do with what you give them. The more controlling you are the more assumptions you can make.
Also, theoretically you can change the underlying implementation or something, but since for the most part it's:
private Foo foo;
public Foo getFoo() { return foo; }
public void setFoo(Foo foo) { this.foo = foo; }
It's a little hard to justify.
Encapsulation is important when at least one of these holds:
Anyone but you is going to use your class (or they'll break your invariants because they don't read the documentation).
Anyone who doesn't read the documentation is going to use your class (or they'll break your carefully documented invariants). Note that this category includes you-two-years-from-now.
At some point in the future someone is going to inherit from your class (because maybe an extra action needs to be taken when the value of a field changes, so there has to be a setter).
If it is just for me, and used in few places, and I'm not going to inherit from it, and changing fields will not invalidate any invariants that the class assumes, only then I will occasionally make a field public.
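For what "breaking an invariant" looks like, here is a small hypothetical example (the Range class is invented for illustration). The invariant is low <= high, and length() silently relies on it:

```java
// Hypothetical class with one invariant: low <= high.
class Range {
    private final int low;
    private int high;

    Range(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        this.low = low;
        this.high = high;
    }

    int getLow() {
        return low;
    }

    int getHigh() {
        return high;
    }

    // The setter is the only way to change high, so the check cannot be bypassed.
    void setHigh(int newHigh) {
        if (newHigh < low) {
            throw new IllegalArgumentException("high must not be below low");
        }
        high = newHigh;
    }

    // Relies on the invariant: the result is never negative.
    int length() {
        return high - low;
    }
}
```

With a public high field, anyone who hadn't read the documentation could set it below low and length() would start returning negative numbers somewhere far away from the actual mistake.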
My tendency is to try to make everything private if possible. This keeps object boundaries as clearly defined as possible and keeps the objects as decoupled as possible. I like this because when I have to rewrite an object that I botched the first (second, fifth?) time, it keeps the damage contained to a smaller number of objects.
If you couple the objects tightly enough, it may be more straightforward just to combine them into one object. If you relax the coupling constraints enough you're back to structured programming.
It may be that if you find that a bunch of your objects are just accessor functions, you should rethink your object divisions. If you're not doing any actions on that data it may belong as a part of another object.
Of course, if you're writing something like a library, you want as clear and sharp an interface as possible so others can program against it.
Fit the tool to the job... recently I saw some code like this in my current codebase:
private static class SomeSmallDataStructure {
public int someField;
public String someOtherField;
}
And then this class was used internally for easily passing around multiple data values. It doesn't always make sense, but if you have just DATA, with no methods, and you aren't exposing it to clients, I find it a quite useful pattern.
The most recent use I had of this was a JSP page where I had a table of data being displayed, defined at the top declaratively. So, initially it was in multiple arrays, one array per data field... this ended in the code being rather difficult to wade through, with fields that would be displayed together not being next to each other in the definition... so I created a simple class like above which would pull it together... the result was REALLY readable code, a lot more so than before.
Moral... sometimes you should consider "accepted bad" alternatives if they may make the code simpler and easier to read, as long as you think it through and consider the consequences... don't blindly accept EVERYTHING you hear.
That said... public getters and setters are pretty much equivalent to public fields... at least essentially (there is a tad more flexibility, but it is still a bad pattern to apply to EVERY field you have).
Even the Java standard library has some cases of public fields.
When I make objects meaningful they are easier to use and easier to maintain.
For example: Person.Hand.Grab(howquick, howmuch);
The trick is not to think of members as simple values but objects in themselves.
I would argue that this question does mix up the concept of encapsulation with 'information hiding'
(this is not a criticism, since it does seem to match a common interpretation of the notion of 'encapsulation')
However for me, 'encapsulation' is either:
the process of regrouping several items into a container
the container itself regrouping the items
Suppose you are designing a tax payer system. For each tax payer, you could encapsulate the notion of child into
a list of children representing the children
a map to take into account children from different parents
an object Children (not Child) which would provide the needed information (like total number of children)
Here you have three different kinds of encapsulation: two represented by low-level containers (list or map), one represented by an object.
By making those decisions, you do not
make that encapsulation public or protected or private: that choice of 'information hiding' is still to be made
make a complete abstraction (you need to refine the attributes of object Children and you may decide to create an object Child, which would keep only the relevant information from the point of view of a tax payer system)
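As a rough sketch of the third option (the Children object), under the assumption that a tax system only cares about a child's name and birth year. Those attributes and all method names here are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction of a child: only the attributes assumed relevant to taxes.
class Child {
    private final String name;
    private final int birthYear;

    Child(String name, int birthYear) {
        this.name = name;
        this.birthYear = birthYear;
    }

    String getName() {
        return name;
    }

    int getBirthYear() {
        return birthYear;
    }
}

// Encapsulation: the children are regrouped behind one object.
// Making the list private is a separate 'information hiding' decision.
class Children {
    private final List<Child> items = new ArrayList<>();

    void add(Child child) {
        items.add(child);
    }

    int total() {
        return items.size();
    }

    // Counts children born in the given year or later.
    int bornInOrAfter(int year) {
        int count = 0;
        for (Child child : items) {
            if (child.getBirthYear() >= year) {
                count++;
            }
        }
        return count;
    }
}
```

Here the regrouping (Children), the visibility (private items) and the abstraction (which Child attributes exist at all) are three distinct choices.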
Abstraction is the process of choosing which attributes of the object are relevant to your system, and which must be completely ignored.
So my point is:
That question might have been titled:
Private vs. Public members in practice (how important is information hiding?)
Just my 2 cents, though. I perfectly respect that one may consider encapsulation as a process including 'information hiding' decision.
However, I always try to differentiate 'abstraction' - 'encapsulation' - 'information hiding or visibility'.
#VonC
You might find the International Organisation for Standardization's, "Reference Model of Open Distributed Processing," an interesting read. It defines: "Encapsulation: the property that the information contained in an object is accessible only through interactions at the interfaces supported by the object."
I tried to make a case for information hiding's being a critical part of this definition here:
http://www.edmundkirwan.com/encap/s2.html
Regards,
Ed.
I find lots of getters and setters to be a code smell that the structure of the program is not designed well. You should look at the code that uses those getters and setters, and look for functionality that really should be part of the class. In most cases, the fields of a class should be private implementation details and only the methods of that class may manipulate them.
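A small illustration of what "functionality that really should be part of the class" can look like. The Account class is a hypothetical example, not from any real project. Instead of callers doing a get, a check and a set, the rule lives in one method:

```java
// Hypothetical example: the withdrawal rule lives inside the class,
// rather than in every caller via getBalance()/setBalance().
class Account {
    private long balanceCents;

    Account(long openingBalanceCents) {
        this.balanceCents = openingBalanceCents;
    }

    // Returns false and changes nothing if the amount is invalid or too large.
    boolean withdraw(long amountCents) {
        if (amountCents <= 0 || amountCents > balanceCents) {
            return false;
        }
        balanceCents -= amountCents;
        return true;
    }

    long getBalanceCents() {
        return balanceCents;
    }
}
```

There is still a getter, but no setter is needed any more, and the overdraft check cannot be forgotten by a caller.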
Having both getters and setters is equal to the field being public (when the getters and setters are trivial/generated automatically). Sometimes it might be better to just declare the fields public, so that the code will be more simple, unless you need polymorphism or a framework requires get/set methods (and you can't change the framework).
But there are also cases where having getters and setters is a good pattern. One example:
When I create the GUI of an application, I try to keep the behaviour of the GUI in one class (FooModel) so that it can be unit tested easily, and have the visualization of the GUI in another class (FooView) which can be tested only manually. The view and model are joined with simple glue code; when the user changes the value of field x, the view calls setX(String) on the model, which in turn may raise an event that some other part of the model has changed, and the view will get the updated values from the model with getters.
In one project, there is a GUI model which has 15 getters and setters, of which only 3 get methods are trivial (such that the IDE could generate them). All the others contain some functionality or non-trivial expressions, such as the following:
public boolean isEmployeeStatusEnabled() {
    return pinCodeValidation.equals(PinCodeValidation.VALID);
}
public EmployeeStatus getEmployeeStatus() {
    Employee employee;
    if (isEmployeeStatusEnabled()
            && (employee = getSelectedEmployee()) != null) {
        return employee.getStatus();
    }
    return null;
}
public void setEmployeeStatus(EmployeeStatus status) {
    getSelectedEmployee().changeStatusTo(status, getPinCode());
    fireComponentStateChanged();
}
In practice I always follow only one rule, the "no size fits all" rule.
Encapsulation and its importance are a product of your project. What objects will be accessing your interface, how will they be using it, and will it matter if they have unneeded access rights to members? Those are the kinds of questions you need to ask yourself when working on each project implementation.
I base my decision on the code's depth within a module.
If I'm writing code that is internal to a module, and does not interface with the outside world, I don't encapsulate things with private as much because it affects my programmer performance (how fast I can write and rewrite my code).
But for the objects that serve as the module's interface with user code, I adhere to strict privacy patterns.
Certainly it makes a difference whether you're writing internal code or code to be used by someone else (or even by yourself, but as a contained unit). Any code that is going to be used externally should have a well defined/documented interface that you'll want to change as little as possible.
For internal code, depending on the difficulty, you may find it's less work to do things the simple way now, and pay a little penalty later. Of course Murphy's law will ensure that the short term gain will be erased many times over in having to make wide-ranging changes later on where you needed to change a class' internals that you failed to encapsulate.
Specifically to your example of using a collection that you would return, it seems possible that the implementation of such a collection might change (unlike simpler member variables) making the utility of encapsulation higher.
That being said, I kinda like Python's way of dealing with it. Member variables are public by default. If you want to hide them or add validation there are techniques provided, but those are considered the special cases.
I follow the rules on this almost all the time. There are four scenarios for me - basically, the rule itself and several exceptions (all Java-influenced):
Usable by anything outside of the current class, accessed via getters/setters
Internal-to-class usage typically preceded by 'this' to make it clear that it's not a method parameter
Something meant to stay extremely small, like a transport object - basically a straight shot of attributes; all public
Needed to be non-private for extension of some sort
There's a practical concern here that isn't being addressed by most of the existing answers. Encapsulation and the exposure of clean, safe interfaces to outside code is always great, but it's much more important when the code you're writing is intended to be consumed by a spatially- and/or temporally-large "user" base. What I mean is that if you plan on somebody (even you) maintaining the code well into the future, or if you're writing a module that will interface with code from more than a handful of other developers, you need to think much more carefully than if you're writing code that's either one-off or wholly written by you.
Honestly, I know what wretched software engineering practice this is, but I'll oftentimes make everything public at first, which makes things marginally faster to remember and type, then add encapsulation as it makes sense. Refactoring tools in most popular IDEs these days make which approach you use (adding encapsulation vs. taking it away) much less relevant than it used to be.