What's the best way to organize code? [closed] - language-agnostic

I'm not talking about how to indent here. I'm looking for suggestions about the best way of organizing the chunks of code in a source file.
Do you arrange methods alphabetically? In the order you wrote them? Thematically? In some kind of 'didactic' order?
What organizing principles do you follow? Why?

I normally order by the following:
constructors
destructors
getters
setters
any 'magic' methods
methods for changing the persisted state of the receiver (save() etc.)
behaviors
public helper methods
private/protected helper methods
anything else (although if there is anything else, it's normally a sign that some refactoring is necessary)

I tend to use the following pattern:
public static final variables
static functions, static blocks
variables
constructors
functions that do something related to logic
getters and setters (mostly uninteresting, so there is no need to read them)
I have no pattern for placing local classes; mostly I put them above the first method that uses them.
I don't like separating methods depending on access level. If some public method uses some private method they will be close to one another.
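To make the ordering above concrete, here is a minimal sketch in Java. The Playlist class and all of its members are made up for illustration; only the order of the sections follows the answer. Note how the private helper isFull() sits right below the public method that uses it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, used only to illustrate the ordering described above.
class Playlist {
    // 1. public static final variables
    public static final int MAX_TRACKS = 3;

    // 2. static functions
    static Playlist empty(String name) {
        return new Playlist(name);
    }

    // 3. variables
    private final String name;
    private final List<String> tracks = new ArrayList<>();

    // 4. constructors
    Playlist(String name) {
        this.name = name;
    }

    // 5. logic: the public method and the private helper it uses sit together
    boolean add(String track) {
        if (isFull()) {
            return false;
        }
        tracks.add(track);
        return true;
    }

    private boolean isFull() {
        return tracks.size() >= MAX_TRACKS;
    }

    // 6. getters and setters last
    String getName() {
        return name;
    }

    int getTrackCount() {
        return tracks.size();
    }
}
```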

I tend to group methods that relate to each other. Use of a good IDE removes much of this concern. Alphabetizing methods seems like a waste of effort to me.

I group them based on what they're doing, and then in the order I wrote them (alphabetically would probably be better though).
E.g. in texture.cpp I have:
//====(DE)CONSTRUCTOR====
...
//====LOAD FUNCTIONS====
...
//====SAVE FUNCTIONS====
...
//====RESOURCE MANAGEMENT FUNCTIONS====
//(preventing multiple copies being loaded etc)
...
//====UTIL FUNCTIONS====
//getting texture details, etc
...
//====OVERLOADED OPERATORS====
...

I use pretty much this approach for anything I am coding in. Good structure and well-commented code make for good reading:
Global Variables
Functions
Main Body/Method

Public, protected and then private, and within each section alphabetically, although I often list the constructor first and the destructor last.
/Allan

I tend to group things thematically for lack of a better word.
For example, if I had a public method that used two private methods in the course of doing its work then I would group all three together in the implementation file since odds are good that if you're going to be looking at one of them then you'll need to look at one of the others.
I also always group get/set methods for a particular class member.
It's really personal preference, especially with modern IDEs since there are a lot of features that allow you to automatically jump to locations in the code.

I like to keep things simple, so I don't stuff a bunch of methods in a class. Within a class, I usually have the most commonly used (or modified-by-me ;-)) methods listed first. As far as specific code organization goes, each set of methods belongs to a class, so it organizes by itself.
I make use of my editor's search feature, and code folding to navigate through large source files. Similarly, I make use of search features to find things in other contexts too. A grand organization scheme never suited me, so I rely on the power of search in all things, not just code.

Interesting point. I hadn't really thought about this.
I tend to put frequently-accessed functions at the top (utility functions, and such), as they're the most likely to need tweaking.
I don't think organization is particularly important, as I can find any function quickly. I don't scroll through my file to find a function; I search for it.
In C++, I do expect that the functions in the .cpp file are in the same order in which they're declared in the .h file. Which is usually constructors, followed by destructors, followed by primary/central functionality functions, followed by utility functions.

I mostly write C code and I tend to order by dependency. If possible I try to match my source code file with my header files, but generally, if void a() uses int b(char *foo), then int b(char *foo) comes first.
Saves me from adding entries to the header file for local functions.
For the rest it's mainly alphabetic actually, makes searching easier.

I have all the private fields, then the public, then the constructors, then the main, then the methods that main calls, in the order they are called.

How do you organize your code?

I'm not referring to the indentation or the directory structure but the actual file itself.
Do you arrange your members and methods alphabetically? Maybe in their order of use or order of complex logic (either ascending or descending)? Is there any rhyme to your madness?
I'm thinking about making the switch to alphabetically but some situations would just madden me:
var _height
var _properties
var _width
width and height should definitely be grouped together... but sometimes finding the right method in a larger file can be pretty disheartening.
What do you do?
I tend to be disorganized. I certainly don't arrange my variables alphabetically. And with C#, I am even less organized, because for some variables I use stuff_like_this, and with properties I might DoItLikeThis. It kind of drives me mad at times, but in the end I like letting the IDE features do the work for me. Visual Studio is incredibly nice, and being able to just right-click on a variable to go to its definition, or see everywhere in the code that it's being used, is downright awesome. I don't know what IDE you use, but hopefully it's got similar features. In the end, I care less about the organization and more about whether my design is fundamentally sound (and that part usually takes me a few tries to get right).
I would suggest not trying to apply any sweeping general rules to your code. Rather, on a case-by-case basis, organize it in a way that makes sense. In the example you give, purely alphabetical would not make sense. In other cases, it would.
Methods: public first, then private. Methods which are related (e.g. getHeight(), getWidth(), getArea()) are placed near each other. If there is a hierarchical relationship between methods, then high-level goes before low-level (e.g. getArea() uses getWidth(), so it's placed before it).
Variables: similar to functions, public first, then private. Grouped according to context.
Alphabetical? I don't like it. It can be a nightmare when reading / modifying the code. I cannot tell whether a function's name is sqrt() or calculateSqrt(), so I won't enjoy looking for it. If the functions are organized according to context, it will be much easier to find them.
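A small sketch of that layout in Java, using the method names from the answer. The Rectangle class itself is an assumption made for illustration: related methods are adjacent, and the high-level getArea() comes before the accessors it calls.

```java
// Hypothetical example class; only the method names come from the answer above.
class Rectangle {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    // High-level method first: it is built from the two accessors below.
    public double getArea() {
        return getWidth() * getHeight();
    }

    // Related low-level accessors follow, kept next to each other.
    public double getWidth() {
        return width;
    }

    public double getHeight() {
        return height;
    }
}
```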
Generally order the members by their nature (constants, fields, constructors, methods) and then by their visibility (private, protected, public).
For finding the right thing I rely a lot on a modern IDE's capabilities (e.g. Visual Studio + Resharper)
Functions are topologically sorted with the "root" at the bottom of the file.
Variables hardly matter -- if a single "unit" (function, class, etc.) has enough variables for their arrangement to mean anything, that's probably something that needs fixing in itself.
I organize my code by type of element, meaning that all the methods are in one place (at the end of the file, that is), all the constructors in another (before the methods), etc. The same goes for variables, constants, enums etc. Usually I place private variables first and others after them, but that's not a strict rule. Also, by nature, code related to each other often ends up in the same place, but that's just because it is easiest.
I don't bother keeping my code in alphabetical order as I do that easily in the Eclipse outline view. However, if the editor does not support this, then I might use alphabetical order. Well, at least for bigger files.

Difficulty in naming functions [duplicate]

Possible Duplicate:
Anyone else find naming classes and methods one of the most difficult part in programming?
Sometimes it seems I can't find any name for a function I am writing. Can this be because the function is not cohesive enough?
What do you do when no good name for a function comes to mind?
For naming functions, avoid bare nouns and name them after verbs instead. Some pointers:
Have function names that are visibly distinct, e.g. don't have validateInput() and validateUserInput(), since it's hard to say what one does that the other doesn't. Also, avoid characters that look very similar, e.g. the number 1 and lowercase 'l'. Sometimes it makes a difference.
Are you working on a project with multiple people? You should spend some time going over naming conventions as well, such as if the function name should have underscores, should be camelCase, etc.
Hungarian notation is a bad idea; avoid doing it.
Think about what the function is doing. The cohesion that you mentioned in your question comes to mind. Generally, functions should do just one thing, so don't name it constructCarAndRunCar() but rather have one function that constructs and another that runs it. If your functions are between, say 20 and 40 lines, you're good.
Sometimes, and this depends on the project, you might also want to prefix your function names with the class if the class is purely procedural (only composed of functions). So if you have a class that takes care of running a simulation, name your functions sim_pauseSimulation() and sim_restartSimulation(). If your class is OOP-based, this isn't an issue as much.
Don't use the underlying data structures in the functions themselves; these should be abstracted away. Rather than having functions like addToVector() or addToArray(), have them be addToList() instead. This is especially true if these are prototypes or the data structures might change later.
Finally, be consistent in your naming conventions. Once you come up with a convention after some thinking, stick to it. PHP comes to mind when thinking of inconsistent function names.
Happy coding! :)
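As a rough illustration of the "one function, one thing, named with a verb" pointer above, here is how constructCarAndRunCar() might be split in Java. Car and Garage are hypothetical types invented for this sketch.

```java
// Hypothetical Car type, purely for illustration.
class Car {
    private boolean running;

    boolean isRunning() {
        return running;
    }

    void start() {
        running = true;
    }
}

class Garage {
    // Instead of one constructCarAndRunCar(), each function does one thing
    // and is named with a verb for that one thing.
    static Car constructCar() {
        return new Car();
    }

    static void runCar(Car car) {
        car.start();
    }
}
```

A caller that really needs both simply calls constructCar() and then runCar(); the names tell the reader exactly which step does what.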
Give it your best shot and refactor later if it still doesn't fit.
Sometimes it could be that your function is too large and therefore doing too many things. Try splitting up your function into other functions and it might be clearer what to call each individual function.
Don't worry about limiting names to one or two words. Sometimes, if a function does something that can be explained in a mini-sentence of sorts, go ahead and name the function a little longer if it'll help other developers understand what is going on.
Another suggestion is to get feedback from others. Often others who come from another perspective and seeing the function for the first time will have a better idea on what to call the function.
I follow this rule: Name according to the purpose (Why? - design decision) and not to the contents (What, How? - can be seen in the code).
For functions it is almost always an action (verb) followed by the noun of the parameters and/or results. (Off-topic, but for variables do not use "arrayOfNames" or "listOfNames"; these carry type information. Simply use "names".) This will also avoid inconsistencies if you refactor the code partly.
For given patterns like object creation, be consistent and always use the same naming, like "Create..." (and not sometimes "Allocate..." or "Build..."), otherwise you or your colleagues will end up scratching your heads.
I find it easier to name functions when I don't have to cut back on the words. As long as you're not writing JavaScript for the Google start page, you can use longer names.
For example, you have the methods dequeueReusableCellWithIdentifier and mergeChangesFromContextDidSaveNotification in Apple's Cocoa framework.
As long as it's clear what the function is doing you can name it whatever you want and refactor it later.
Almost as important as the function name is that you are consistent with comments. Many IDEs will use your properly formatted comments not only to provide context-sensitive help for a function you might be using, but also to generate documentation. This is invaluable when returning to a project after a long period or when working with other developers.
In academic settings, they provide an appreciated demonstration of your intentions.
A good rule of thumb is [verb]returnDescription. This is easy with GetName() type functions, but it can't be applied universally. It's tough to find a balance between unobtrusive and descriptive code.
Here's a .Net convention guide, but it is applicable to most languages.
Go to www.thesaurus.com and try to find a better-suited name through synonyms.
As a practical rule of my own, if a function name is too long, it should be atomized into a new object. Yet, I agree with all posts above. Btw, nice noob question.

Mandatory method documentation [closed]

On my previous job, providing all methods with javadoc was mandatory, which resulted in things like:
/**
 * Sets the Frobber.
 *
 * @param frobber The frobber
 */
public void setFrobber(Frobber frobber) { ... }
As you can see, the documentation takes up space and work, but adds little to the code.
Should documenting all methods be mandatory or optional? Is there a rule for which methods to document? What are pros and cons of requiring every method to be documented?
"providing all methods with javadoc was mandatory"
I strongly suspect that documenting all methods was mandatory, but providing javadoc comments was all that could be automatically enforced and hence all that was uniformly done.
Personally I think it's better to have no javadoc than completely useless javadoc - at least you can see from a glance at the HTML which methods are undocumented, because there are no descriptions of the parameters etc.
Documentation is frequently underrated, because it always seems less important and urgent when you're writing the code, than it does when you're using it later. But the style and form of documentation is often overrated - auto-generated XML nonsense is still nonsense. Given the choice, I'd rather have the code comment // Sets this object to use the specified frobber for all future frobbing, than your zero-information javadoc.
For all I know from your docs, the function doesn't actually modify this object at all, it might call the set() function on frobber, or it might be while(!frobber.isset()) { refrigerator.add(frobber); sleep(3600); refrigerator.remove(frobber); } Hence it "sets the frobber". I'm sure I read somewhere that "set" is the word with the most distinct definitions in the OED. Brief descriptions are ambiguous and hence misleading, and the purpose of documentation is to stop people relying on your source, and hence on details of your current implementation. My comment doesn't really take any longer to write than it took to add "Sets the frobber" and "the frobber" to the IDE-generated javadoc stub. It doesn't explain what frobbing is or when this object does it (hopefully that's elsewhere in the class docs) but at least it tries to tell you what the function does.
As for when to mandate documentation - I think every interface must be documented. If you're not defining Java interfaces, the "interface" is every public and protected method, and every package-protected method unless the package is tiny. Implementation doesn't have to be documented, although it should be commented if the way it works is non-obvious. Documentation might be as simple as the sentence in my comment above - you don't necessarily need a separate sentence for each parameter if the method description already says what they are.
If you have code review, then IMO the answer is to review comments and documentation at the same time. If you don't have code review, then you need a cone of shame for whichever developer most recently forced someone else to come over and ask what the code actually does.
The same applies to anyone who relied on undocumented behaviour of a function, with a result that an implementation change that didn't change the interface, breaks their code. The way you enforce that code be documented, is to complain that you can't call it until you know what it guarantees to do. Arbitrary rules like, "javadoc comments must exist" become less important, at least for functions that other developers need to call.
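For comparison with the zero-information javadoc in the question, here is a sketch of what a more useful version might look like, built around the one-line comment suggested above. Frobber's frob() method, the Widget class and the null check are all assumptions made up for the example, not part of the question's code.

```java
import java.util.Objects;

// Hypothetical types: Frobber and Widget are placeholders for illustration.
interface Frobber {
    String frob(String input);
}

class Widget {
    private Frobber frobber;

    /**
     * Sets this object to use the specified frobber for all future frobbing.
     *
     * @param frobber the frobber to use from now on; must not be null
     * @throws NullPointerException if {@code frobber} is null
     */
    public void setFrobber(Frobber frobber) {
        this.frobber = Objects.requireNonNull(frobber, "frobber");
    }

    /** Frobs the input with the most recently set frobber. */
    public String process(String input) {
        return frobber.frob(input);
    }
}
```

The description now says what the call guarantees (which object changes, and from when), so a caller does not have to read the body to find out.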
For big projects or frameworks/libraries or even open source project that you are creating, it is mandatory. For small personal or private projects it is optional. Having said that, it is always a good idea to document your code so if you come back to your project after a year whether small or big, you know what it was doing. This really helps greatly.
You should always document your code, especially if someone else works or will work on it. Maybe you haven't had to work on undocumented legacy code yet, but it can be a real pain!
About the comment itself, one thing to avoid is writing a comment just because it is mandatory. Think for a few seconds and you'll find something to say about your method: something that's not already in the method name, something that might not be obvious to the next developer. Explain what your method does, what the corner cases are, and what it expects as input.
And remember:
Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
It applies to comments too :)
It's much easier to maintain "self-documenting" code. If you choose good function and variable names, keep functions short (eg. < 10 lines with only a single idea per function), this will help keep the purpose of the code clear. And you won't have to try to keep the comments up to date - the only thing worse than no comments is comments that are wrong!
There's a good and recent summary of various points of view at InfoQ.
Documentation of code is very important. But Javadoc (or a similar tool) is neither the only nor the best method for this. The biggest downside is that Javadoc documentation must be kept up to date. If the method is changed but the description stays the same, this documentation can do more harm than good.
To avoid the problem of documentation being out of sync with the code, use code to document. Unit tests show how your code is used, and asserts in the code can ensure that parameters and return values are validated. In one project I added asserts to a calculation to check that its probabilities were always between 0 and 1. Later one of these asserts triggered in a use case and pointed me directly to a bug.
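A minimal sketch of that kind of check in Java. The Probabilities class and bothOccur() are hypothetical, and explicit exceptions are used instead of the assert keyword because Java asserts are disabled unless the JVM is started with -ea.

```java
// Hypothetical helper class; the calculation is only a stand-in.
class Probabilities {
    // Probability that two independent events both occur.
    static double bothOccur(double pA, double pB) {
        requireProbability(pA);
        requireProbability(pB);
        double result = pA * pB;
        // Check the result as well, so a bug in the calculation shows up here.
        if (!(result >= 0.0 && result <= 1.0)) {
            throw new IllegalStateException("result out of range: " + result);
        }
        return result;
    }

    // Written as !(in range) so that NaN is rejected too.
    private static void requireProbability(double p) {
        if (!(p >= 0.0 && p <= 1.0)) {
            throw new IllegalArgumentException("not a probability: " + p);
        }
    }
}
```

The checks double as documentation: anyone reading bothOccur() sees at once what range the inputs and the result are supposed to stay in.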
The most important documentation is good naming. If you set a Frobber, then setFrobber is a good name. The Javadoc given in your example adds nothing to this naming. frobIt would be a not-so-good name, method3 would be very bad. Code reviews should help to get good naming.
Javadoc and other documentation should be added if the other methods aren't sufficient. But in this case you need to take care that this documentation is always up to date.
Q: Should documenting all methods be mandatory or optional?
A: Mandatory.
Q: Is there a rule for which methods to document?
A: All of them.
Q: What are pros and cons of requiring every method to be documented?
A: Pros: Smart people can spend time focusing on code writing, not code figuring-out. Code is well explained. Code can be passed to newbies. Cons: Whining. Stale comments.
A focus on quality commenting obviates the 'code is self-documenting' issues.
In the case of getters and setters, not every get and set is trivial. Sometimes it is, that's great. When it isn't, the comment should note the information. It's better to be conservative and always have comments than unconservative and have to scrap code and waste time figuring it out.
Final example: The Carmack Inverse Square Root code. Self-documenting, eh?
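For anyone who hasn't seen it, here is a rough Java transliteration of that trick (the original is C; the class and method names here are my own). Strip the comments and the good names tell you what it returns, but nothing about why the middle lines work.

```java
class FastMath {
    // Approximates 1 / sqrt(x) for positive x.
    static float fastInverseSqrt(float x) {
        float halfX = 0.5f * x;
        int bits = Float.floatToIntBits(x);   // reinterpret the float's bits as an int
        bits = 0x5f3759df - (bits >> 1);      // magic constant yields a first guess
        float y = Float.intBitsToFloat(bits); // back to a float
        y = y * (1.5f - halfX * y * y);       // one Newton-Raphson step refines the guess
        return y;
    }
}
```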

Class member order in source code

This has been asked before (question no. 308581), but that particular question and the answers are a bit C++ specific and a lot of things there are not really relevant in languages like Java or C#.
The thing is that even after refactoring, I find that there is a bit of a mess in my source code files. I mean, the function bodies are alright, but I'm not quite happy with the way the functions themselves are ordered. Of course, in an IDE like Visual Studio it is relatively easy to find a member if you remember what it is called, but this is not always the case.
I've tried a couple of approaches, like putting public methods first, but the drawback of this approach is that a function at the top of the file ends up calling another private function at the bottom of the file, so I end up scrolling all the time.
Another approach is to try to group related methods together (maybe into regions) but obviously this has its limits as if there are many non-related methods in the same class then maybe it's time to break up the class to two or more smaller classes.
So consider this: your code has been refactored properly so that it satisfies all the requirements mentioned in Code Complete, but you would still like to reorder your methods for ergonomic purposes. What's your approach?
(Actually, while not exactly a technical problem, this problem really annoys the hell out of me, so I would be really grateful if someone could come up with a good approach.)
Actually I totally rely on the navigation functionality of my IDE, i.e. Visual Studio. Most of the time I use F12 to jump to the declaration (or Shift-F12 to find all references) and Ctrl+- to jump back.
The reason for that is that most of the time I am working on code that I haven't written myself and I don't want to spend my time re-ordering methods and fields.
P.S.: And I also use RockScroll, a VS add-in which makes navigating and scrolling large files quite easy
If you're really having problems scrolling and finding, it's possible you're suffering from god class syndrome.
Fwiw, I personally tend to go with:
class
{
    #statics (if any)
    #constructor
    #destructor (if any)
    #member variables
    #properties (if any)
    #public methods (overrides, etc, first then extensions)
    #private (aka helper) methods (if any)
}
And I have no aversion to region blocks, nor comments, so make free use of both to denote relationships.
From my (Java) point of view I would say constructors, public methods, private methods, in that order. I always try to group methods implementing a certain interface together.
My favorite weapon of choice is IntelliJ IDEA, which has some nice possibilities to fold method bodies, so it is quite easy to display two methods directly above each other even when their actual position in the source file is 700 lines apart.
I would be careful with monkeying around with the position of methods in the actual source. Your IDE should give you the ability to view the source in the way you want. This is especially relevant when working on a project where developers can use their IDE of choice.
My order, here it comes.
I usually put statics first.
Next come member variables and properties, a property that accesses one specific member is grouped together with this member. I try to group related information together, for example all strings that contain path information.
Third is the constructor (or constructors if you have several).
After that follow the methods. Those are ordered by whatever appears logical for that specific class. I often group methods by their access level: private, protected, public. But I recently had a class that needed to override a lot of methods from its base class. Since I was doing a lot of work there, I put them together in one group, regardless of their access level.
My recommendation: Order your classes so that it helps your workflow. Do not simply order them, just to have order. The time spent on ordering should be an investment that helps you save more time that you would otherwise need to scroll up and down.
In C# I use #region to separate those groups from each other, but that is a matter of taste. There are a lot of people who don't like regions. I do.
I place the most recent method I just created on top of the class. That way when I open the project, I'm back at the last method I'm developing. Easier for me to get back "in the zone."
It also reflects the fact that the method I just created (which uses other methods) is the topmost layer over those other methods.
Group related functions together, don't be hard-pressed to put all private functions at the bottom. Likewise, imitate the design rationale of C#'s properties, related functions should be in close proximity to each other, the C# language construct for properties reinforces that idea.
P.S.
If only C# could nest functions like Pascal or Delphi. Maybe Anders Hejlsberg can put it in C#; he also invented Turbo Pascal and Delphi :-) The D language has nested functions.
A few years ago I spent far too much time pondering this question, and came up with a horrendously complex system for ordering the declarations within a class. The order would depend on the access specifier, whether a method or field was static, transient, volatile etc.
It wasn't worth it. IMHO you get no real benefit from such a complex arrangement.
What I do nowadays is much simpler:
Constructors (default constructor first, otherwise order doesn't matter.)
Methods, sorted by name (static vs. non-static doesn't matter, nor abstract vs. concrete, virtual vs. final etc.)
Inner classes, sorted by name (interface vs. class etc. doesn't matter)
Fields, sorted by name (static vs. non-static doesn't matter.) Optionally constants (public static final) first, but this is not essential.
I'm pretty sure there was a Visual Studio add-in that could re-order the class members in the code.
So, e.g., ctors at the top of the class, then static methods, then instance methods...
Something like that.
Unfortunately I can't remember the name of this add-in! I also think that it was free!
Maybe someone else can help us out?
My personal take for structuring a class is as follows:
I'm strict with
constants and static fields first, in alpha order
non-private inner classes and enums in alpha order
fields (and attributes where applicable), in alpha order
ctors (and dtors where applicable)
static methods and factory methods
methods below, in alpha order, regardless of visibility.
I use the auto-formatting capabilities of an IDE at all times. So I'm constantly hitting Ctrl+Shift+F when I'm working. I export auto-formatting capabilities in an xml file which I carry with me everywhere.
It helps down the lane when doing merges and rebases. And it is the type of thing you can automate in your IDE or build process so that you don't have to make a brain cell sweat for it.
I'm not claiming MY WAY is the way. But pick something, configure it, use it consistently until it becomes a reflex, and thus forget about it.

Private vs. Public members in practice (how important is encapsulation?) [closed]

One of the biggest advantages of object-oriented programming is encapsulation, and one of the "truths" we've (or, at least, I've) been taught is that members should always be made private and made available via accessor and mutator methods, thus ensuring the ability to verify and validate the changes.
I'm curious, though, how important this really is in practice. In particular, if you've got a more complicated member (such as a collection), it can be very tempting to just make it public rather than make a bunch of methods to get the collection's keys, add/remove items from the collection, etc.
Do you follow the rule in general? Does your answer change depending on whether it's code written for yourself vs. to be used by others? Are there more subtle reasons I'm missing for this obfuscation?
It depends. This is one of those issues that must be decided pragmatically.
Suppose I had a class for representing a point. I could have getters and setters for the X and Y coordinates, or I could just make them both public and allow free read/write access to the data. In my opinion, this is OK because the class is acting like a glorified struct - a data collection with maybe some useful functions attached.
However, there are plenty of circumstances where you do not want to provide full access to your internal data and rely on the methods provided by the class to interact with the object. An example would be an HTTP request and response. In this case it's a bad idea to allow anybody to send anything over the wire - it must be processed and formatted by the class methods. In this case, the class is conceived of as an actual object and not a simple data store.
It really comes down to whether or not verbs (methods) drive the structure or if the data does.
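The "glorified struct" case might look like this in Java. This is just a sketch, and the distanceTo() helper is an assumption added to show "some useful functions attached".

```java
// A "glorified struct": plain data with public fields and a helper attached.
class Point {
    public double x;
    public double y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Convenience function working directly on the public data.
    double distanceTo(Point other) {
        return Math.hypot(other.x - x, other.y - y);
    }
}
```

Here the data drives the structure, so free read/write access to x and y costs little. For something like the HTTP request example, the methods drive the structure and the fields would stay private.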
As someone having to maintain several-year-old code worked on by many people in the past, it's very clear to me that if a member attribute is made public, it is eventually abused. I've even heard people disagreeing with the idea of accessors and mutators, as that's still not really living up to the purpose of encapsulation, which is "hiding the inner workings of a class". It's obviously a controversial topic, but my opinion would be "make every member variable private, think primarily about what the class has got to do (methods) rather than how you're going to let people change internal variables".
Yes, encapsulation matters. Exposing the underlying implementation does (at least) two things wrong:
Mixes up responsibilities. Callers shouldn't need or want to understand the underlying implementation. They should just want the class to do its job. By exposing the underlying implementation, your class isn't doing its job. Instead, it's just pushing the responsibility onto the caller.
Ties you to the underlying implementation. Once you expose the underlying implementation, you're tied to it. If you tell callers, e.g., there's a collection underneath, you cannot easily swap the collection for a new implementation.
These (and other) problems apply regardless of whether you give direct access to the underlying implementation or just duplicate all the underlying methods. You should be exposing the necessary implementation, and nothing more. Keeping the implementation private makes the overall system more maintainable.
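A small sketch of the second point, in Java. The Guests class and its method names are hypothetical; the idea is that callers see only the operations they need, so the collection behind them can be swapped without touching any calling code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class: callers see operations, not the collection behind them.
class Guests {
    // Private, so it could later become a List, a Map or a database table
    // without any caller noticing.
    private final Set<String> names = new HashSet<>();

    void invite(String name) {
        names.add(name);
    }

    boolean isInvited(String name) {
        return names.contains(name);
    }

    int count() {
        return names.size();
    }
}
```

Note that the class does not mirror every method of Set; it exposes three operations and nothing more.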
I prefer to keep members private as long as possible and only access them via getters, even from within the very same class. I also try to avoid setters in a first draft, to promote value-style objects as long as possible. When working with dependency injection a lot, you often have setters but no getters, as clients should be able to configure the object but (others) not get to know what's actually configured, as this is an implementation detail.
Regards,
Ollie
I tend to follow the rule pretty strictly, even when it's just my own code. I really like Properties in C# for that reason. It makes it really easy to control what values it's given, but you can still use them as variables. Or make the set private and the get public, etc.
Basically, information hiding is about code clarity. It's designed to make it easier for someone else to extend your code, and prevent them from accidentally creating bugs when they work with the internal data of your classes. It's based on the principle that nobody ever reads comments, especially ones with instructions in them.
Example: I'm writing code that updates a variable, and I need to make absolutely sure that the GUI changes to reflect the update. The easiest way is to add a mutator method (aka a "Setter"), which is called instead of updating the data directly.
If I make that data public, and something changes the variable without going through the Setter method (and this happens every swear-word time), then someone will need to spend an hour debugging to find out why the updates aren't being displayed. The same applies, to a lesser extent, to "Getting" data. I could put a comment in the header file, but odds are that no-one will read it till something goes terribly, terribly wrong. Enforcing it with private means that the mistake can't be made, because it'll show up as an easily located compile-time bug, rather than a run-time bug.
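A minimal sketch of that idea, written in Java rather than the C++ the paragraph hints at. The `Counter` class and the listener mechanism are assumptions made for illustration; the listener stands in for "refresh the GUI".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Hypothetical model: the field is private, so the only way to change it
// from outside is setValue, which always notifies the registered listeners.
final class Counter {
    private int value;
    private final List<IntConsumer> listeners = new ArrayList<>();

    void addListener(IntConsumer listener) {
        listeners.add(listener);
    }

    int getValue() {
        return value;
    }

    void setValue(int newValue) {
        value = newValue;
        // Stand-in for updating the display.
        for (IntConsumer listener : listeners) {
            listener.accept(newValue);
        }
    }
}
```

Code outside the class that tries to write `counter.value = 3` fails to compile, which is the "easily located compile-time bug" described above.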
From experience, the only times you'd want to make a member variable public, and leave out Getter and Setter methods, is if you want to make it absolutely clear that changing it will have no side effects; especially if the data structure is simple, like a class that simply holds two variables as a pair.
This should be a fairly rare occurrence, as normally you'd want side effects, and if the data structure you're creating is so simple that you don't (e.g. a pairing), there will already be a more efficiently written one available in a Standard Library.
With that said, for most small programs that are one-use no-extension, like the ones you get at university, it's more "good practice" than anything, because you'll remember over the course of writing them, and then you'll hand them in and never touch the code again. Also, if you're writing a data structure as a way of finding out about how they store data rather than as release code, then there's a good argument that Getters and Setters will not help, and will get in the way of the learning experience.
It's only when you get to the workplace or a large project, where the probability is that your code will be called by objects and structures written by different people, that it becomes vital to make these "reminders" strong. Whether or not it's a single man project is surprisingly irrelevant, for the simple reason that "you six weeks from now" is as different a person as a co-worker. And "me six weeks ago" often turns out to be lazy.
A final point is that some people are pretty zealous about information hiding, and will get annoyed if your data is unnecessarily public. It's best to humour them.
C# Properties 'simulate' public fields. Looks pretty cool and the syntax really speeds up creating those get/set methods
Keep in mind the semantics of invoking methods on an object. A method invocation is a very high level abstraction that can be implemented by the compiler or the run time system in a variety of different ways.
If the object whose method you are invoking exists in the same process/memory map then a method could well be optimized by a compiler or VM to directly access the data member. On the other hand if the object lives on another node in a distributed system then there is no way that you can directly access its internal data members, but you can still invoke its methods by sending it a message.
By coding to interfaces you can write code that doesn't care where the target object exists or how its methods are invoked or even if it's written in the same language.
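A small Java sketch of coding to an interface, under the assumption of a hypothetical `Thermometer` type. The "remote" implementation here only simulates a message send with a supplier; a real one would go through some network transport.

```java
import java.util.function.DoubleSupplier;

// Hypothetical interface: callers depend on this, not on where the data lives.
interface Thermometer {
    double celsius();
}

// In-process implementation: a plain field read behind a method call.
final class LocalThermometer implements Thermometer {
    private final double reading;

    LocalThermometer(double reading) {
        this.reading = reading;
    }

    @Override
    public double celsius() {
        return reading;
    }
}

// Stand-in for an object on another node: the supplier simulates sending a
// message and waiting for the reply. There is no field to access directly.
final class RemoteThermometer implements Thermometer {
    private final DoubleSupplier transport;

    RemoteThermometer(DoubleSupplier transport) {
        this.transport = transport;
    }

    @Override
    public double celsius() {
        return transport.getAsDouble();
    }
}

final class Reporter {
    // Works the same for either implementation.
    static String describe(Thermometer t) {
        return t.celsius() > 30.0 ? "hot" : "ok";
    }
}
```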
In your example of an object that implements all the methods of a collection, then surely that object actually is a collection. So maybe this would be a case where inheritance would be better than encapsulation.
It's all about controlling what people can do with what you give them. The more controlling you are the more assumptions you can make.
Also, theoretically you can change the underlying implementation or something, but since for the most part it's:
private Foo foo;
public Foo getFoo() { return foo; }
public void setFoo(Foo foo) { this.foo = foo; }
It's a little hard to justify.
Encapsulation is important when at least one of these holds:
Anyone but you is going to use your class (or they'll break your invariants because they don't read the documentation).
Anyone who doesn't read the documentation is going to use your class (or they'll break your carefully documented invariants). Note that this category includes you-two-years-from-now.
At some point in the future someone is going to inherit from your class (because maybe an extra action needs to be taken when the value of a field changes, so there has to be a setter).
If it is just for me, and used in few places, and I'm not going to inherit from it, and changing fields will not invalidate any invariants that the class assumes, only then I will occasionally make a field public.
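To make "invalidate an invariant" concrete, here is a minimal Java sketch. The `Range` class is hypothetical; its invariant is that `low` never exceeds `high`, which a public field could not guarantee.

```java
// Hypothetical class with an invariant: low <= high at all times.
final class Range {
    private int low;
    private final int high;

    Range(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        this.low = low;
        this.high = high;
    }

    int getLow() {
        return low;
    }

    // The setter is the single place where the invariant is enforced.
    void setLow(int newLow) {
        if (newLow > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        low = newLow;
    }

    // Never negative, because the invariant holds.
    int width() {
        return high - low;
    }
}
```

If `low` were public, any caller could set it above `high` and `width()` would quietly return a negative number.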
My tendency is to try to make everything private if possible. This keeps object boundaries as clearly defined as possible and keeps the objects as decoupled as possible. I like this because when I have to rewrite an object that I botched the first (second, fifth?) time, it keeps the damage contained to a smaller number of objects.
If you couple the objects tightly enough, it may be more straightforward just to combine them into one object. If you relax the coupling constraints enough you're back to structured programming.
It may be that if you find that a bunch of your objects are just accessor functions, you should rethink your object divisions. If you're not doing any actions on that data it may belong as a part of another object.
Of course, if you're writing something like a library you want as clear and sharp an interface as possible so others can program against it.
Fit the tool to the job... recently I saw some code like this in my current codebase:
private static class SomeSmallDataStructure {
    public int someField;
    public String someOtherField;
}
And then this class was used internally for easily passing around multiple data values. It doesn't always make sense, but if you have just DATA, with no methods, and you aren't exposing it to clients, I find it a quite useful pattern.
The most recent use I had of this was a JSP page where I had a table of data being displayed, defined at the top declaratively. So, initially it was in multiple arrays, one array per data field... this ended in the code being rather difficult to wade through, with fields that would be displayed together not being next to each other in the definition... so I created a simple class like above which would pull it together... the result was REALLY readable code, a lot more so than before.
Moral... sometimes you should consider "accepted bad" alternatives if they may make the code simpler and easier to read, as long as you think it through and consider the consequences... don't blindly accept EVERYTHING you hear.
That said... public getters and setters are pretty much equivalent to public fields... at least essentially (there is a tad more flexibility, but it is still a bad pattern to apply to EVERY field you have).
Even the Java standard libraries have some cases of public fields.
When I make objects meaningful they are easier to use and easier to maintain.
For example: Person.Hand.Grab(howquick, howmuch);
The trick is not to think of members as simple values but objects in themselves.
I would argue that this question does mix-up the concept of encapsulation with 'information hiding'
(this is not a criticism, since it does seem to match a common interpretation of the notion of 'encapsulation')
However for me, 'encapsulation' is either:
the process of regrouping several items into a container
the container itself regrouping the items
Suppose you are designing a tax payer system. For each tax payer, you could encapsulate the notion of child into
a list of children representing the children
a map, to take into account children from different parents
an object Children (not Child) which would provide the needed information (like total number of children)
Here you have three different kinds of encapsulation: two represented by low-level containers (list or map), one represented by an object.
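The third option might be sketched in Java as follows. The class and member names are hypothetical, and storing names as strings is just an assumption to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container object: it regroups the children and offers the
// information a tax payer system would ask for.
final class Children {
    // How the children are stored is itself an implementation choice.
    private final List<String> names = new ArrayList<>();

    void add(String name) {
        names.add(name);
    }

    // Total number of children.
    int count() {
        return names.size();
    }
}

final class TaxPayer {
    // Grouping the children into one object is the encapsulation step.
    // Whether this member ends up public, protected or private is a
    // separate visibility decision; package-private is used here arbitrarily.
    final Children children = new Children();
}
```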
By making those decisions, you do not
make that encapsulation public or protected or private: that choice of 'information hiding' is still to be made
make a complete abstraction (you need to refine the attributes of object Children and you may decide to create an object Child, which would keep only the relevant information from the point of view of a tax payer system)
Abstraction is the process of choosing which attributes of the object are relevant to your system, and which must be completely ignored.
So my point is:
That question might have been titled:
Private vs. Public members in practice (how important is information hiding?)
Just my 2 cents, though. I perfectly respect that one may consider encapsulation as a process including 'information hiding' decision.
However, I always try to differentiate 'abstraction' - 'encapsulation' - 'information hiding or visibility'.
#VonC
You might find the International Organisation for Standardization's, "Reference Model of Open Distributed Processing," an interesting read. It defines: "Encapsulation: the property that the information contained in an object is accessible only through interactions at the interfaces supported by the object."
I tried to make a case for information hiding's being a critical part of this definition here:
http://www.edmundkirwan.com/encap/s2.html
Regards,
Ed.
I find lots of getters and setters to be a code smell that the structure of the program is not designed well. You should look at the code that uses those getters and setters, and look for functionality that really should be part of the class. In most cases, the fields of a class should be private implementation details and only the methods of that class may manipulate them.
Having both getters and setters is equivalent to the field being public (when the getters and setters are trivial or generated automatically). Sometimes it might be better to just declare the fields public, so that the code will be simpler, unless you need polymorphism or a framework requires get/set methods (and you can't change the framework).
But there are also cases where having getters and setters is a good pattern. One example:
When I create the GUI of an application, I try to keep the behaviour of the GUI in one class (FooModel) so that it can be unit tested easily, and have the visualization of the GUI in another class (FooView) which can be tested only manually. The view and model are joined with simple glue code; when the user changes the value of field x, the view calls setX(String) on the model, which in turn may raise an event that some other part of the model has changed, and the view will get the updated values from the model with getters.
In one project, there is a GUI model which has 15 getters and setters, of which only 3 get methods are trivial (such that the IDE could generate them). All the others contain some functionality or non-trivial expressions, such as the following:
public boolean isEmployeeStatusEnabled() {
    return pinCodeValidation.equals(PinCodeValidation.VALID);
}
public EmployeeStatus getEmployeeStatus() {
    Employee employee;
    if (isEmployeeStatusEnabled()
            && (employee = getSelectedEmployee()) != null) {
        return employee.getStatus();
    }
    return null;
}
public void setEmployeeStatus(EmployeeStatus status) {
    getSelectedEmployee().changeStatusTo(status, getPinCode());
    fireComponentStateChanged();
}
In practice I always follow only one rule, the "no size fits all" rule.
Encapsulation and its importance are a product of your project. Which objects will be accessing your interface? How will they be using it? Will it matter if they have unneeded access rights to members? Those are the kinds of questions you need to ask yourself when working on each project implementation.
I base my decision on the Code's depth within a module.
If I'm writing code that is internal to a module and does not interface with the outside world, I don't encapsulate things with private as much, because it affects my programmer performance (how fast I can write and rewrite my code).
But for the objects that serve as the module's interface with user code, I adhere to strict privacy patterns.
Certainly it makes a difference whether you're writing internal code or code to be used by someone else (or even by yourself, but as a contained unit). Any code that is going to be used externally should have a well defined/documented interface that you'll want to change as little as possible.
For internal code, depending on the difficulty, you may find it's less work to do things the simple way now, and pay a little penalty later. Of course Murphy's law will ensure that the short term gain will be erased many times over in having to make wide-ranging changes later on where you needed to change a class' internals that you failed to encapsulate.
Specifically to your example of using a collection that you would return, it seems possible that the implementation of such a collection might change (unlike simpler member variables) making the utility of encapsulation higher.
That being said, I kinda like Python's way of dealing with it. Member variables are public by default. If you want to hide them or add validation there are techniques provided, but those are considered the special cases.
I follow the rules on this almost all the time. There are four scenarios for me - basically, the rule itself and several exceptions (all Java-influenced):
Usable by anything outside of the current class, accessed via getters/setters
Internal-to-class usage typically preceded by 'this' to make it clear that it's not a method parameter
Something meant to stay extremely small, like a transport object - basically a straight shot of attributes; all public
Needed to be non-private for extension of some sort
There's a practical concern here that isn't being addressed by most of the existing answers. Encapsulation and the exposure of clean, safe interfaces to outside code is always great, but it's much more important when the code you're writing is intended to be consumed by a spatially- and/or temporally-large "user" base. What I mean is that if you plan on somebody (even you) maintaining the code well into the future, or if you're writing a module that will interface with code from more than a handful of other developers, you need to think much more carefully than if you're writing code that's either one-off or wholly written by you.
Honestly, I know what wretched software engineering practice this is, but I'll oftentimes make everything public at first, which makes things marginally faster to remember and type, then add encapsulation as it makes sense. Refactoring tools in most popular IDEs these days make which approach you use (adding encapsulation vs. taking it away) much less relevant than it used to be.