Using one method with constants as parameters versus several methods - language-agnostic

In Kent Beck's Implementation Patterns, one can read:
"A common use of constants is to communicate variations of a message in an interface. For example, to center text you could invoke setJustification(Justification.CENTERED). One advantage of this style of API is that you can add new variants of existing methods by adding new constants without breaking implementors. However, these messages don't communicate as well as having a separate method for each variation. In this style, the message above would be justifyCentered(). An interface where all invocations of a method have literal constants as arguments can be improved by giving it separate methods for each constant value."
Why is this? Generally when I'm coding and I notice that I have a couple of similar parameterless methods that could be reduced to just one, with an argument, like in the following example,
void justifyRight()
void justifyLeft()
void justifyCentered()
I'd generally do just the opposite of what Kent advises, which would be to group them into
setJustification(Justification justification)
How do you usually handle this situation? Is this totally subjective, or is there really a very strong reason that I can't see in favour of Kent's view of this matter?
Thanks

File access methods usually have parameters regarding read/write mode, whether to create non-existing files, security attributes, locking modes and so on. Imagine the amount of methods you'd have if you'd create a separate method for each valid combination of parameters!
I've highlighted the biggest argument in favor of separate methods; it's fail-safe because you have strict control over the API. The caller cannot pass in invalid arguments, or invalid combinations of parameters, if you don't expose such parameters. This also implies less complex parameter validation.
However, I'm not in favor of this practice. APIs should be well-designed and should change as little as possible. Kent Beck on breaking API changes:
One advantage of [parameterized methods] is that you can add new variants of existing methods by adding new constants without breaking implementors.
His argument in favor of separate methods is:
However, [parameterized methods] don't communicate as well as having a separate method for each variation.
I disagree. Method parameters can be just as readable, especially in combination with named parameters, a feature supported by several languages. Besides, separate methods would result in a cluttered API.

I suppose it's subjective. Some may argue that justifyLeft() is clearer than justify(Justification.LEFT). Collapsing it all into one method may result in a nicer API with less clutter, and the mode can be stored in a variable and simply fed to the single setXY method (with a separate method for each variant, you'd have to decide manually which one to call depending on the value). Therefore I usually prefer the single-method way. Though it's usually just:
void justify(Justification justification) {
    // Dispatch on the enum constant; Java case labels use the unqualified name.
    switch (justification) {
        case RIGHT:    this.justifyRight();    break;
        case LEFT:     this.justifyLeft();     break;
        case CENTERED: this.justifyCentered(); break;
    }
}
Of course this is only advisable when all these methods are very closely related.
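To make the trade-off concrete, here is a minimal sketch in Java that offers both styles at once. TextBox and getJustification are hypothetical names invented for the illustration; only the Justification constants and the method names come from the discussion above. Unlike the justify method above, the named methods here delegate to the parameterized one, which is just one possible arrangement.

```java
import java.util.Objects;

// Hypothetical types for illustration only.
enum Justification { LEFT, RIGHT, CENTERED }

class TextBox {
    private Justification justification = Justification.LEFT;

    // Parameterized style: one method, and the variant is ordinary data
    // that a caller can keep in a variable or read from configuration.
    void setJustification(Justification justification) {
        this.justification = Objects.requireNonNull(justification, "justification");
    }

    // Separate-method style: thin wrappers that read well at the call site.
    void justifyLeft()     { setJustification(Justification.LEFT); }
    void justifyRight()    { setJustification(Justification.RIGHT); }
    void justifyCentered() { setJustification(Justification.CENTERED); }

    Justification getJustification() { return justification; }
}
```

With this shape, a caller holding a saved Justification value passes it straight to setJustification, while a caller with a literal constant can use the named method instead.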

Related

Is assert in private function redundant if check has already been made by the calling public function?

Effective Java recommends assertions in private methods as a good practice.
"For an unexported method, you as the package author control the circumstances under which the method is called, so you can and should ensure that only valid parameter values are ever passed in. Therefore, nonpublic methods should generally check their parameters using assertions, as shown below:"
For example:
// Private helper function for a recursive sort
private static void sort(long a[]) {
assert a != null;
// Do the computation;
}
My question is: would the asserts be required even if the public function calling sort has a null check?
Example:
public void computeTwoNumbersThatSumToInputValue(int a[], int x) {
    if (a == null) {
        throw new NullPointerException();
    }
    sort(a);
    // code to do the required.
}
In other words, will the asserts in the private function be redundant or mandatory in this case?
Thanks,
It's redundant if you're sure that you've got the assertion in all the calling code. In some cases, that's very obvious - in other cases it can be less so. If you're calling sort from 20 places in the class, are you sure you've checked it in every case?
It's a matter of taste and balance, with no "one size fits all" answer. The balance is in terms of code clarity (both ways!), performance (in extreme cases) and of course safety. It depends on the exact context, and I wouldn't personally like to even guarantee that I'm entirely consistent. (In other words, "level of caffeine at the time of coding" may turn out to be an influence too.)
Note that your assert is only going to execute when assertions are turned on anyway - I personally prefer to validate parameters consistently however you're running the code. I generally use the Preconditions class from Guava to make preconditions unobtrusive.
Assertions will make the helper function sort more robust to use.
Checking parameters before passing them to a method is a good way to keep control over exceptions that would otherwise surface unexpectedly at runtime.
My suggestion is to use both approaches in your code, as there is no guarantee that all callers of sort will do such checks. If the assertions in helper methods are computationally expensive or seem redundant, they can be disabled (especially for production use) with -disableassertions or -da on the command line.
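A minimal sketch of using both approaches together, under some assumptions: PairFinder and hasPairWithSum are hypothetical names, the two-pointer search is only filler for "code to do the required", and Objects.requireNonNull from the standard library stands in for Guava's Preconditions.

```java
import java.util.Arrays;
import java.util.Objects;

class PairFinder {
    // Public entry point: validates unconditionally, whether or not -ea is set.
    static boolean hasPairWithSum(int[] a, int x) {
        Objects.requireNonNull(a, "a must not be null");
        int[] sorted = a.clone();            // avoid mutating the caller's array
        sort(sorted);
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo < hi) {
            long sum = (long) sorted[lo] + sorted[hi];  // long avoids int overflow
            if (sum == x) return true;
            if (sum < x) lo++; else hi--;
        }
        return false;
    }

    // Private helper: the assert documents the assumption and only runs
    // when assertions are enabled (java -ea).
    private static void sort(int[] a) {
        assert a != null : "callers must pass a non-null array";
        Arrays.sort(a);
    }
}
```

The public check throws in every configuration, while the assert costs nothing in a default run and catches a forgotten check in any future caller during testing.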
You could do that. I will quote from the Oracle docs.
An assertion is a statement in the Java™ programming language that enables you to test your assumptions about your program. For example, if you write a method that calculates the speed of a particle, you might assert that the calculated speed is less than the speed of light.
I do not personally use assertions, but from what I gathered reading the Oracle docs, they let you test your assumptions about what you expect something to do. Try/catch blocks are more for failing gracefully when failures are bound to happen (such as network or hardware problems). Basically, in a perfect world your code would always run successfully because there's nothing wrong with it code-wise. But this isn't a perfect world. Also note:
Experience has shown that writing assertions while programming is one of the quickest and most effective ways to detect and correct bugs. As an added benefit, assertions serve to document the inner workings of your program, enhancing maintainability.
I would say use as a preference. To answer your question, I would mainly use it to test code as the docs say, while testing assumptions you have about your code. As the second quote mentions, it has the added benefit of telling other developers (or future you) what you assume to get as parameters. As a personal preference, I leave control flow to try/catch blocks as that is what they were designed for.
*But keep in mind that assertions could be turned off.

Should we avoid using Object as the input parameter/output value of a method?

Take Java syntax as an example, though the question itself is language independent. If the following snippet takes an object MyAbstractEmailTemplate as input argument in the method setTemplate, the class MyGateway will then become tightly-coupled with the object MyAbstractEmailTemplate, which lessens the re-usability of the class MyGateway.
A compromise is to use dependency injection to ease the instantiation of MyAbstractEmailTemplate. This might solve the coupling problem to some extent, but the interface is still rigid, hardly providing enough flexibility to other developers/applications.
So if we only use primitive data types (or even plain XML in a web service) as the input/output of a method, it seems the coupling problem no longer exists. What do you think?
public class MyGateway {
protected MyAbstractEmailTemplate template;
public void setTemplate(MyAbstractEmailTemplate template) {
this.template = template;
}
}
It's pretty difficult to understand what you are really asking, but going the route of typing everything to Object does not lead to loose coupling, because you can't do anything with the input without downcasting, which would break the Liskov Substitution Principle.
Taken to the extreme it leads you here:
public class MyClass
{
public object Invoke(object obj);
}
This is not loose coupling, it's just obscure and hard-to-maintain code.
The name MyAbstractEmailTemplate makes me believe that you are talking about an abstract class.
You should always program against interfaces, so instead of having MyGateway depend on MyAbstractEmailTemplate, it should depend on an EmailTemplate interface, where MyAbstractEmailTemplate implements EmailTemplate. Then, you can pass your custom implementations around as you want to, without further tight coupling.
Combine this with DI and you've got yourself a pretty decent solution.
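A minimal sketch of that arrangement, with assumptions stated up front: the render method, WelcomeTemplate, and compose are hypothetical names invented for the example, and constructor injection is used here instead of the setTemplate setter from the question.

```java
import java.util.Objects;

// The contract the gateway actually needs (render is a hypothetical method).
interface EmailTemplate {
    String render(String recipientName);
}

// The abstract class from the question now implements the interface.
abstract class MyAbstractEmailTemplate implements EmailTemplate {
    // Behaviour shared by concrete templates could live here.
}

// Hypothetical concrete template.
class WelcomeTemplate extends MyAbstractEmailTemplate {
    @Override
    public String render(String recipientName) {
        return "Welcome, " + recipientName + "!";
    }
}

class MyGateway {
    private final EmailTemplate template;

    // The gateway depends only on the interface, so any implementation fits.
    MyGateway(EmailTemplate template) {
        this.template = Objects.requireNonNull(template, "template");
    }

    String compose(String recipientName) {
        return template.render(recipientName);
    }
}
```

Because MyGateway only knows EmailTemplate, a caller can pass a subclass of MyAbstractEmailTemplate, an unrelated class, or even a lambda, without the gateway changing.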
Not exactly sure what you mean with "the interface is still rigid", but obviously you should design your interface in such a way that it provides the functionality you need.
MyGateway has to assume something about the inputs. Even if it used XML, it would have to assume something about the structure and content of the XML. Coupling isn't an evil in its own right; it expresses the contract between two pieces of code. The oft-repeated advice to avoid tight coupling is really just saying that coupling should express the essence of a contract, not more and not less. Passing a specific type (particularly an interface type) is a very good way to achieve this balance.
The first problem you will run into is that a lot of types are simply not representable by a primitive data type (It's a Java problem that there are primitive types at all.).
The coupling should be reduced by using a proper inheritance hierarchy. What does proper mean? The method should take exactly that part of the interface as a parameter that is needed. Not more, not less.
After all, you won't be able to avoid dependencies. Methods have to know what they can do with their input, or have to be able to make assumptions (see C++ concepts) about the capabilities of the input.
IMHO there is nothing inherently wrong in using objects (with small cap, not Objects) as method parameters and/or class members. Yes, these create dependencies. You can manage this in (at least) two ways:
1. Acknowledge that by creating this dependency, the two classes become tightly coupled. This is entirely appropriate in many cases, where two (or more) classes in fact form a component, which is a meaningful unit of reuse in itself, and its parts may not make much sense or be interchangeable.
2. If there are multiple interchangeable candidates for a method parameter, these are obvious candidates to form a class hierarchy. Then you program for the interface and can pass any object of any class implementing that interface as parameter to your method. Note that the phrase "there are multiple interchangeable candidates for a method parameter" is a loose rephrasing of the Liskov Substitution Principle, which is the foundation of polymorphism.
3. In some languages, e.g. C++, the third way would be using templates. Then you need no common interface; only the specific methods/members need to be resolvable when the template is instantiated. However, since instantiation happens at compile time, this is entirely static binding.
The problem, I would say, is that the best Java can offer is interfaces, and people are starting to see that they are too rigid. It would be interesting to use something like what the Go language has, which checks that all methods of an interface are present in the type, so you do not have to be explicit about implementing an interface. We also need something better than interfaces to specify constraints, maybe some sort of contracts. Another issue is interface evolution.

Strategy for handling parameter validation in class library

I have a rather big class library that contains a lot of code.
I am looking at how to optimize the performance of some of the code, and for some rather simple utility methods I've found that the parameter validation occupies a rather large portion of the runtime for some core methods.
Let me give a typical example:
1. A.MethodA1 runs a loop, iterating over a collection, calling B.MethodB1 for each element
2. B.MethodB1 processes the element and returns the result; it's a rather basic calculation, but since it is used in many places, it has been put into its own method instead of being copied and pasted where needed
3. A.MethodA1 calls C.MethodC1 with the results of B.MethodB1, and puts the result into a list that is returned at the end of the loop
In the case I've found now, B.MethodB1 does rudimentary parameter validation. Since the method calls other internal methods, I'd like to avoid having NullReferenceExceptions several layers deep into the code, and rather fail early, hence B.MethodB1 validates the parameters, like checking for null and some basic range checks on another parameter.
However, in this particular call scenario, it is impossible (due to other program logic) for these parameters to ever have the wrong values. If they had, from the program standpoint, B.MethodB1 would never be called at all for those values, A.MethodA1 would fail before the call to B.MethodB1.
So I was considering removing the parameter validation in B.MethodB1, since it occupies roughly 65% of the method runtime (and this is part of some heavily used code.)
However, B.MethodB1 is a public method, and can thus be called from the program, in which case I want the parameter validation.
So how would you solve this dilemma?
1. Keep the parameter validation, and take the performance hit
2. Remove the parameter validation, and have potentially fail-late problems in the method
3. Split the method into two, one internal that doesn't have parameter validation, called by the "safe" path, and one public that has the parameter validation + a call to the internal version.
The latter one would give me the benefits of having no parameter validation, while still exposing a public entrypoint which does have parameter validation, but for some reason it doesn't sit right with me.
Opinions?
I would go with option 3. I tend to use assertions for private and internal methods and do all the validation in public methods.
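The split could be sketched as follows. This is Java rather than the question's .NET, and everything beyond the A/B/MethodA1/MethodB1 naming is an assumption: the parameter types, the 1..100 range, the placeholder calculation, and the methodB1Unchecked name are all invented for the illustration, and the C.MethodC1 step is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class B {
    // Public entry point: validates, then delegates to the unchecked version.
    public static double methodB1(double[] element, int scale) {
        Objects.requireNonNull(element, "element");
        if (scale < 1 || scale > 100) {
            throw new IllegalArgumentException("scale out of range: " + scale);
        }
        return methodB1Unchecked(element, scale);
    }

    // Package-private fast path for callers that already guarantee the
    // preconditions. The assert only runs when assertions are enabled (-ea).
    static double methodB1Unchecked(double[] element, int scale) {
        assert element != null && scale >= 1 && scale <= 100;
        double sum = 0;
        for (double v : element) {
            sum += v;
        }
        return sum * scale;   // placeholder for the "rather basic calculation"
    }
}

class A {
    public static List<Double> methodA1(List<double[]> elements, int scale) {
        // Validate once, outside the hot loop.
        Objects.requireNonNull(elements, "elements");
        if (scale < 1 || scale > 100) {
            throw new IllegalArgumentException("scale out of range: " + scale);
        }
        List<Double> results = new ArrayList<>(elements.size());
        for (double[] element : elements) {
            // Assumes A's own logic guarantees non-null elements, as in the question.
            results.add(B.methodB1Unchecked(element, scale));
        }
        return results;
    }
}
```

External callers still get fail-early behaviour from methodB1, while the trusted loop in A pays for validation once per call instead of once per element.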
By the way, is the performance hit really that big?
That's an interesting question.
Hmmm, makes me think ... "code contracts" .. It would seem like it might be technically possible to statically (at compile time) have certain code contracts be proven to be fulfilled. If this were the case and you had such a compilation validation option you could state these contracts without ever having to validate the conditions at runtime.
It would require that the client code itself be validated against the code contracts.
And, of course it would inevitably be highly dependent on the type of conditions you'd want to write, and it would probably only be feasible to prove these contracts to a certain point (how far up the possible call graph would you go?). Beyond this point the validator might have to beg off, and insist that you place a runtime check (or maybe a validation warning suppression?).
All just idle speculation. Does make me wonder a bit more about C# 4.0 code contracts. I wonder if these have support for static analysis. Have you checked them out? I've been meaning to, but learning F# is having to take priority at the moment!
Update:
Having read up a little on it, it appears that C# 4.0 does indeed have a 'static checker' as well as a binary rewriter (which takes care of altering the output binary so that pre and post condition checks are in the appropriate location)
What's not clear from my extremely quick read, is whether you can opt out of the binary rewriting - what I'm thinking here is that what you'd really be looking for is to use the code contracts, have the metadata (or code) for the contracts maintained within the various assemblies but use only the static checker for at least a selected subset of contracts, so that you in theory get proven safety without any runtime hit.
Here's a link to an article on the code contracts

Doesn't Passing in Parameters that Should Be Known Implicitly Violate Encapsulation?

I often hear around here from test driven development people that having a function get large amounts of information implicitly is a bad thing. I can see where this would be bad from a testing perspective, but isn't it sometimes necessary from an encapsulation perspective? The following question comes to mind:
Is using Random and OrderBy a good shuffle algorithm?
Basically, someone wanted to create a function in C# to randomly shuffle an array. Several people told him that the random number generator should be passed in as a parameter. This seems like an egregious violation of encapsulation to me, even if it does make testing easier. Isn't the fact that an array shuffling algorithm requires any state at all other than the array it's shuffling an implementation detail that the caller should not have to care about? Wouldn't the correct place to get this information be implicitly, possibly from a thread-local singleton?
I don't think it breaks encapsulation. The only state in the array is the data itself - and "a source of randomness" is essentially a service. Why should an array naturally have an associated source of randomness? Why should that have to be a singleton? What about different situations which have different requirements - e.g. speed vs cryptographically secure randomness? There's a reason why java.util.Random has a SecureRandom subclass :) Perhaps it doesn't matter whether the shuffle's results are predictable with a lot of effort and observation - or perhaps it does. That will depend on the context, and that's information that the shuffle algorithm shouldn't care about.
Once you start thinking of it as a service, it makes sense that it's passed in as a dependency.
Yes, you could get it from a thread-local singleton (and indeed I'm going to blog about exactly that in the next few days) but I would generally code it so that the caller gets to make that decision.
One benefit of the "randomness as a service" concept is that it makes for repeatability - if you've got a test which fails, you can pass in a Random with a specific seed and know you'll always get the same results, which makes debugging easier.
Of course, there's always the option of making the Random optional - use a thread-local singleton as a default if the caller doesn't provide their own.
Yes, that does break encapsulation. As with most software design decisions, this is a trade-off between two opposing forces. If you encapsulate the RNG then you make it difficult to change for a unit test. If you make it a parameter then you make it easy for a user to change the RNG (and potentially get it wrong).
My personal preference is to make it easy to test, then provide a default implementation (a default constructor that creates its own RNG, in this particular case) and good documentation for the end user. Adding a method with the signature
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
that creates a Random using the current system time as its seed would take care of most normal use cases of this method. The original method
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source, Random rng)
could be used for testing (pass in a Random object with a known seed) and also in those rare cases where a user decides they need a cryptographically secure RNG. The one-parameter implementation should call this method.
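The same pair of overloads can be sketched in Java. This is an illustration, not the code from the linked question: Shuffler is a hypothetical class, the shuffle is a plain Fisher-Yates over a copy of the input, and the default overload relies on java.util.Random's own default seeding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class Shuffler {
    // Convenience overload: the caller does not care where randomness comes from.
    static <T> List<T> shuffle(List<T> source) {
        return shuffle(source, new Random());
    }

    // Full overload: the caller supplies the source of randomness, e.g. a
    // seeded Random for repeatable tests or a SecureRandom for stronger guarantees.
    static <T> List<T> shuffle(List<T> source, Random rng) {
        List<T> result = new ArrayList<>(source);   // leave the input untouched
        // Fisher-Yates: swap position i with a random position j in [0, i].
        for (int i = result.size() - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            T tmp = result.get(i);
            result.set(i, result.get(j));
            result.set(j, tmp);
        }
        return result;
    }
}
```

Two calls with new Random(42) return the same ordering, which is what makes a failing test reproducible; since SecureRandom is a subclass of Random, it can be passed to the same overload.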
I don't think this violates encapsulation.
Your Example
I would say that being able to provide an RNG is a feature of the class. I would obviously provide a method that doesn't require it, but I can see times where it may be useful to be able to duplicate the randomization.
What if the array shuffler were part of a game that used the RNG for level generation? If a user wanted to save the level and play it again later, it may be more efficient to store the RNG seed.
General Case
Simple classes that have a single task like this typically don't need to worry about divulging their inner workings. What they encapsulate is the logic of the task, not the elements required by that logic.

Should I use an interface like IEnumerable, or a concrete class like List<>

I recently expressed my view about this elsewhere*, but I think it deserves further analysis so I'm posting this as its own question.
Let's say that I need to create and pass around a container in my program. I probably don't have a strong opinion about one kind of container versus another, at least at this stage, but I do pick one; for sake of argument, let's say I'm going to use a List<>.
The question is: Is it better to write my methods to accept and return a high level interface such as C#'s IEnumerable? Or should I write methods to take and pass the specific container class that I have chosen.
What factors and criteria should I look for to decide? What kinds of programs would benefit from one or the other? Does the computer language affect your decision? Performance? Program size? Personal style?
(Does it even matter?)
*(Homework: find it. But please post your answer here before you look for my own, so as not to bias you.)
Your method should always accept the least-specific type it needs to execute its function. If your method needs to enumerate, accept IEnumerable. If it needs to do IList<>-specific things, by definition you must give it an IList<>.
The only thing that should affect your decision is how you plan to use the parameter. If you're only iterating over it, use IEnumerable<T>. If you are accessing indexed members (eg var x = list[3]) or modifying the list in any way (eg list.Add(x)) then use ICollection<T> or IList<T>.
There is always a tradeoff. The general rule of thumb is to declare things as high up the hierarchy as possible. So if all you need is access to the methods in IEnumerable then that is what you should use.
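The rule of thumb can be sketched in Java, treating Iterable as a rough analogue of IEnumerable<T> and List as a rough analogue of IList<T>. The analogy is an assumption made for the example, and Stats, sum, and middle are hypothetical names.

```java
import java.util.List;

class Stats {
    // Only iterates, so Iterable is enough: any List, Set, or custom sequence works.
    static int sum(Iterable<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Needs by-index access, so it asks for List and nothing more specific.
    static int middle(List<Integer> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("values must not be empty");
        }
        return values.get(values.size() / 2);
    }
}
```

A caller holding a TreeSet can pass it to sum unchanged, whereas a sum declared to take ArrayList would force that caller to copy the data first.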
Another recent example from an SO question was a C API that took a filename instead of a FILE * (or file descriptor). There the filename severely limited what sorts of things could be passed in (there are many things you can pass in with a file descriptor, but only one that has a filename).
Once you have to start casting you have either gone too high OR you should be making a second method that takes a more specific type.
The only exception to this that I can think of is when speed is an absolute must and you do not want to go through the expense of a virtual method call. Declaring the specific type removes the overhead of virtual functions (will depend on the language/environment/implementation, but as a general statement that is likely correct).
It was a discussion with me that prompted this question, so Euro Micelli already knows my answer, but here it is! :)
I think Linq to Objects already provides a great answer to this question. By using the simplest interface to a sequence of items it could, it gives maximum flexibility about how you implement that sequence, which allows lazy generation, boosting productivity without sacrificing performance (not in any real sense).
It is true that premature abstraction can have a cost - but mainly it is the cost of discovering/inventing new abstractions. But if you already have perfectly good ones provided to you, then you'd be crazy not to take advantage of them, and that is what the generic collection interfaces provide you with.
There are those who will tell you that it is "easier" to make all the data in a class public, just in case you will need to access it. In the same way, Euro advised that it would be better to use a rich interface to a container such as IList<T> (or even the concrete class List<T>) and then clean up the mess later.
But I think, just as it is better to hide the data members of a class that you don't want to access, to allow you to modify the implementation of that class easily later, so you should use the simplest interface available to refer to a sequence of items. It is easier in practice to start by exposing something simple and basic and then "loosen" it later, than it is to start with something loose and struggle to impose order on it.
So assume IEnumerable<T> will do to represent a sequence. Then in those cases where you need to Add or Remove items (but still don't need by-index lookup), use ICollection<T>, which inherits IEnumerable<T> and so will be perfectly interoperable with your other code.
This way it will be perfectly clear (just from local examination of some code) precisely what that code will be able to do with the data.
Small programs require less abstraction, it is true. But if they are successful, they tend to become big programs. This is much easier if they employ simple abstractions in the first place.
It does matter, but the correct solution depends entirely on usage. If you only need a simple enumeration, then use IEnumerable; that way any implementer can be passed in to provide the functionality you need. However, if you need list functionality, and you don't want to create a new list instance every time the method is called with an enumerable that isn't already a list, then go with a list.
I answered a similar C# question here. I think you should always provide the simplest contract you can, which in the case of collections, in my opinion, ordinarily is IEnumerable<T>.
The implementation can be provided by an internal BCL type - be it Set, Collection, List etcetera - whose required members are exposed by your type.
Your abstract type can always inherit simple BCL types, which are implemented by your concrete types. This in my opinion allows you to adhere to LSP easier.