'Traditional programming language' equivalent of dynamic SQL - JSON

I come from a database programming background (e.g., Oracle PL/SQL) and had to learn some TypeScript for a recent project due to the difficulty of hiring web developers. I did OK - it was very clunky, back-end-ish code I was writing. It just had to work, and it did.
I feel like there's a programming modality familiar to the world of databases that I can't see how to accomplish in TypeScript. Consider this working fragment of code to evaluate a business rule:
public br_12(contractBid: WFO.ContractBid) {
  const condition1 = (contractBid.n_tax_per <= 0.003);
  const condition2 = (this.br_12(contractBid) === 0 && this.br_18(contractBid) === 0 && contractBid.n_tax_per <= 0.005);
  if (condition1 || condition2) {
    return Math.ceil(this.br_2(contractBid) * contractBid.n_tax_per * 100) / 100;
  } else {
    return 0;
  }
}
I would like to be able to store the business rules, specifically the conditions, externally in a JSON file - or other external data source - and sometimes reference them arbitrarily and programmatically. And be able to update them without "changing any code" - of course I know the app would have to be rebuilt and redeployed when they change. So to write very rough sample code:
{
  "business rules conditions": {
    "condition1": "(contractBid.n_tax_per <= 0.003)",
    "condition2": "(this.br_12(contractBid) === 0 && this.br_18(contractBid) === 0 && contractBid.n_tax_per <= 0.005)"
  }
}
public br_12(contractBid: WFO.ContractBid) {
  if (this.external.condition1 || this.external.condition2) {
    return Math.ceil(this.br_2(contractBid) * contractBid.n_tax_per * 100) / 100;
  } else {
    return 0;
  }
}
Nevermind that I'm crazy to want to do such a thing - is there a way to do it?

Unfortunately, JavaScript / TypeScript treats data as non-executable.
So you would need to store your rules as strings and convert them to code, e.g. using:
eval, which is considered bad practice (because it is an easy attack vector)
text parsing, i.e. building a mini syntax / Domain-Specific Language
But if I understand correctly, what you really want is to separate the definition of the business rules from the places where they are used - perhaps so that they are all gathered in a single place and are easier to review and maintain.
In that case, there are many other possible solutions:
make each business rule a simple function, all exported in a single module
methods of a singleton class (essentially similar to the previous point, but may be more familiar to OOP developers)
You could even consider excluding that file from source control (in case you want to swap it depending on the environment, "without changing any code"), although for such a use case, since it involves code, it is better to commit all the different code versions and import the appropriate module based on an environment variable (e.g. using a tsconfig import path alias).
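For example, a minimal sketch of the first option - every rule condition as a plain function in one module. The ContractBid field and the rule numbers come from the question; the module layout and the Rules interface are made up:
// businessRuleConditions.ts - all rule conditions gathered in one reviewable place.
export interface ContractBid { n_tax_per: number; }

// Stand-in for whatever class owns br_2 / br_12 / br_18 in the question.
export interface Rules {
  br_12(bid: ContractBid): number;
  br_18(bid: ContractBid): number;
}

export const conditions = {
  condition1: (bid: ContractBid) => bid.n_tax_per <= 0.003,
  condition2: (bid: ContractBid, rules: Rules) =>
    rules.br_12(bid) === 0 && rules.br_18(bid) === 0 && bid.n_tax_per <= 0.005,
};

// In the rule class:
//   if (conditions.condition1(contractBid) || conditions.condition2(contractBid, this)) { ... }
Because the conditions are ordinary functions, the compiler still type-checks them - which is exactly what the JSON-string approach gives up.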

Actually, I just found this... about using the transpile function with eval. Seems to be 'kosher' enough? Eval is not going to be a huge concern as an attack vector because this is a very closed-access business system, not a public website. And anyhow, the configuration script/JSON/SharePoint list will be even more limited in access.
Evaluate Typescript from string?
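For what it's worth, a rough sketch of that transpile-then-eval idea. It assumes the typescript package is available at runtime; the condition string and contractBid object here are only illustrative, and new Function is still code execution, so it should only ever see trusted configuration:
import * as ts from "typescript";

// Condition text as it might arrive from a JSON file or a SharePoint list.
const conditionSource = "contractBid.n_tax_per <= 0.003";

// Compile the TypeScript expression down to plain JavaScript...
const js = ts.transpile(`(${conditionSource})`);

// ...then evaluate it with an explicit parameter list instead of a bare eval().
const evaluateCondition = new Function("contractBid", `return ${js}`);

const contractBid = { n_tax_per: 0.002 }; // stand-in for a WFO.ContractBid
console.log(evaluateCondition(contractBid)); // true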


What is the benefit in removing parentheses when using Kotlin lambda expressions?

Consider this example given here: https://stackoverflow.com/a/53376287/11819720
fun onClick(action: () -> Unit) { ... }
view.onClick({ toast(it.toString()) })
view.onClick() { toast(it.toString()) }
view.onClick { toast(it.toString()) }
and this code:
fun convert(x: Double, converter: (Double) -> Double): Double {
    val result = converter(x)
    println("$x is converted to $result")
    return result
}

fun convertFive(converter: (Int) -> Double): Double {
    val result = converter(5)
    println("5 is converted to $result")
    return result
}

fun main(args: Array<String>) {
    convert(20.0) { it * 1.8 + 32 }
    convertFive { it * 1.8 + 32 }
}
It seems the effort is to save on writing a few parentheses at the risk of confusing a reader.
Would it not be better if it was kept standard? The usual way to use the functions in the second example would be:
convert(20.0, { it * 1.8 + 32 })
convertFive({ it * 1.8 + 32 })
but IntelliJ will complain and suggest moving the lambda out of the parenthesis. Why? Is there a practical benefit? Seems like writing 10 + 2 * 5 / 34 instead of 10 + (2 * 5) / 34.
The real benefit of the trailing-braces lambda is that, like all the best language features, it changes the way you think.
The examples you have provided are just as well written with parentheses, and I agree that the IntelliJ suggestion to use the trailing form all the time is unnecessary...
But when I write something like this:
with(someObject) {
    doSomething()
    doSomethingElse()
}
It looks like I'm using a cool new feature of the language even though I'm really just calling a function.
The result is that people start thinking like they can write functions that add things to the language, because they kinda can, and that leads them to create new ways of doing things for the people who use their code.
The type-safe builder pattern is a great example of this. A lot of the Kotlin language features work together so that, even though it's just calling functions and passing lambdas, it provides a new experience for the developers that use it.
They get a whole new way to instantiate complex objects that is much more natural than the old ways, and nothing needed to be added to the language just for that -- all the little building-block features can be used for many other things.
There is no practical benefit at all; it is just a convention.
Regarding what you said about "keeping it standard": where exactly did you get the "usual way" from? There are no global programming conventions that I am aware of, only language-specific ones, so this notation is standard by definition.
Conventions are important as they make reading code a lot less effort for anyone also familiar with the syntax. The conventions also reflect the usage of the language. With Kotlin they promote a very functional style with heavy use of lambdas and inline functions so 'lambdas out of parentheses' is necessary to keep the code clean and explicit.
Also, as @Tenfour04 said in the comments, your examples really don't reflect the intended usage of the syntax. Generally you have multiple lines, and even if you don't, the pattern is supposed to convey something more. Take the measureTimeMillis function for example:
measureTimeMillis {
    askQuestion()
    comment()
    answerQuestion()
}
By having the lambda outside of the parentheses it is immediately evident what the function does, even to a non-technical reader, which is exactly what conventions are there for.
Closer to your example, let's say you need to convert an array of numbers and square all the positive ones. Compare which is easier to read:
val result = arrayOf(1.0, 2.0, -3.0).map({ number ->
    convert(number, {
        if (it > 0) it * it else it
    })
})

val result = arrayOf(1.0, 2.0, -3.0).map { number ->
    convert(number) {
        if (it > 0) it * it else it
    }
}

A tool to detect unnecessary recursive calls in a program?

A very common beginner mistake when writing recursive functions is to accidentally fire off completely redundant recursive calls from the same function. For example, consider this recursive function that finds the maximum value in a binary tree (not a binary search tree):
int BinaryTreeMax(Tree* root) {
    if (root == nullptr) return INT_MIN;
    int maxValue = root->value;
    if (maxValue < BinaryTreeMax(root->left))
        maxValue = BinaryTreeMax(root->left);   // (1)
    if (maxValue < BinaryTreeMax(root->right))
        maxValue = BinaryTreeMax(root->right);  // (2)
    return maxValue;
}
Notice that this program potentially makes two completely redundant recursive calls to BinaryTreeMax in lines (1) and (2). We could rewrite this code so that there's no need for these extra calls by simply caching the value from before:
int BinaryTreeMax(Tree* root) {
    if (root == nullptr) return INT_MIN;
    int maxValue = root->value;
    int leftValue = BinaryTreeMax(root->left);
    int rightValue = BinaryTreeMax(root->right);
    if (maxValue < leftValue)
        maxValue = leftValue;
    if (maxValue < rightValue)
        maxValue = rightValue;
    return maxValue;
}
Now, we always make exactly two recursive calls.
My question is whether there is a tool that does either a static or dynamic analysis of a program (in whatever language you'd like; I'm not too picky!) that can detect whether a program is making completely unnecessary recursive calls. By "completely unnecessary" I mean that
The recursive call has been made before,
by the same invocation of the recursive function (or one of its descendants), and
the call itself has no observable side-effects.
This is something that can usually be determined by hand, but I think it would be great if there were some tool that could flag things like this automatically as a way of helping students gain feedback about how to avoid making simple but expensive mistakes in their programs that could contribute to huge inefficiencies.
Does anyone know of such a tool?
First, your definition of 'completely unnecessary' is insufficient. It is possible that some code between the two function calls affects the result of the second function call.
Second, this has nothing to do with recursion; the same question applies to any function call that has been made before with the exact same parameters, has no side effects, and where no code between the two calls changed any data the function accesses.
Now, I'm pretty sure a perfect solution is impossible, as it would amount to solving the halting problem, but that doesn't mean there isn't a way to detect enough of these cases and optimize away some of them.
Some compilers know how to do that (GCC has a specific flag that warns you when it does so). Here's a 2003 article I found about the issue: http://www.cs.cmu.edu/~jsstylos/15745/final.pdf .
I couldn't find a tool for this, though, but that's probably something Eric Lippert knows, if he happens to bump into your question.
Some compilers (such as GCC) do have ways to mark deterministic functions explicitly (to be more precise, __attribute__((const)) (see GCC function attributes) applies restrictions to the function body so that its result depends only on its arguments and has no dependency on the shared state of the program or on other non-deterministic functions). They can then eliminate duplicate calls to costly functions. Some other high-level language implementations (Haskell, perhaps) do such tests automatically.
Really, I don't know of tools for such analysis (but if I find one I will be happy). And if there were one that correctly detected unnecessary recursion or, more generally, unnecessary function evaluation (in a language-agnostic environment), it would be a kind of determinacy prover.
BTW, it's not so difficult to write such a program when you already have access to the semantic tree of the code :)
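As a rough illustration of that last point, here is a sketch using the TypeScript compiler API (my choice of parser, not something from the answers above); it only flags textually identical recursive calls within a single function, so it is nowhere near a real determinacy prover:
import * as ts from "typescript";

// Report recursive calls that appear more than once with identical argument text.
function findRedundantRecursiveCalls(source: string): string[] {
  const file = ts.createSourceFile("input.ts", source, ts.ScriptTarget.Latest, true);
  const warnings: string[] = [];

  const visit = (node: ts.Node): void => {
    if (ts.isFunctionDeclaration(node) && node.name) {
      const fnName = node.name.text;
      const seen = new Map<string, number>();
      const walk = (n: ts.Node): void => {
        if (ts.isCallExpression(n) && n.expression.getText() === fnName) {
          const key = n.getText();
          seen.set(key, (seen.get(key) ?? 0) + 1);
        }
        ts.forEachChild(n, walk);
      };
      ts.forEachChild(node, walk);
      for (const [call, count] of seen) {
        if (count > 1) warnings.push(`${fnName}: "${call}" appears ${count} times`);
      }
    }
    ts.forEachChild(node, visit);
  };
  visit(file);
  return warnings;
}

// The BinaryTreeMax pattern from the question, transliterated to TypeScript:
console.log(findRedundantRecursiveCalls(`
  function treeMax(root) {
    if (root === null) return Number.MIN_SAFE_INTEGER;
    let maxValue = root.value;
    if (maxValue < treeMax(root.left)) maxValue = treeMax(root.left);
    if (maxValue < treeMax(root.right)) maxValue = treeMax(root.right);
    return maxValue;
  }
`));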

Should the builder reset its build environment after delivering the product?

I am implementing a builder wherein the deliverable is retrieved by calling Builder::getProduct(). The director asks for the various parts to be built - Builder::buildPartA(), Builder::buildPartB(), etc. - in order to completely build the product.
My question is: once the product is delivered by the builder via Builder::getProduct(), should it reset its environment (Builder::partA = NULL;, Builder::partB = NULL;) so that it is ready to build another product (with the same or a different configuration)?
I ask this because I am using PHP, where objects are by default passed by reference (nope, I don't want to clone them, as one of their fields is a Resource). However, even if you think about it from a language-agnostic point of view, should the builder reset its build environment? If your answer is 'it depends on the case', what use cases would justify resetting the environment (and which the other way round)?
For the sake of providing a code sample, here's my Builder::getProcessor(), which shows what I mean by resetting the environment:
/**
 * @see IBuilder::getProcessor()
 */
public function getProcessor()
{
    if ($this->_processor == NULL) {
        throw new LogicException('Processor not yet built!');
    } else {
        $retval = $this->_processor;
        $this->_product = NULL;
        $this->_processor = NULL;
    }
    return $retval;
}
Resetting the state in getProcessor() is non-obvious, and if you want to do that the method should reflect it in its name, e.g. getProcessorAndReset(). A cleaner solution would be to just give the builder a separate reset() method.
In general, your getProcessor() should not reset its internal state, because methods should not magically change behavior but reliably do the same thing. getProcessor() is a query, and that query should return the same configured Processor on each call. It should not change state. Resetting the state is a command. You want to separate command and query methods.
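A minimal sketch of that command/query split, written in TypeScript for brevity (the class and part names are made up):
// Command/query separation: getProduct() only reads, reset() explicitly clears.
class ReportBuilder {
  private header: string | null = null;
  private body: string | null = null;

  buildHeader(title: string): void { this.header = `# ${title}`; } // command
  buildBody(text: string): void { this.body = text; }              // command

  getProduct(): string {                                           // query
    if (this.header === null || this.body === null) {
      throw new Error("Product not yet built!");
    }
    return `${this.header}\n${this.body}`;
  }

  reset(): void {                                                  // command
    this.header = null;
    this.body = null;
  }
}

// The director (or the client) decides when to start over:
const builder = new ReportBuilder();
builder.buildHeader("Q1");
builder.buildBody("...");
const report = builder.getProduct();
builder.reset(); // ready for the next product, same or different configuration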

What is the best way to replace or substitute if..else if..else trees in programs?

This question is motivated by something I've lately started to see a bit too often, the if..else if..else structure. While it's simple and has its uses, something about it keeps telling me again and again that it could be substituted with something that's more fine-grained, elegant and just generally easier to keep up-to-date.
To be as specific as possible, this is what I mean:
if (i == 1) {
    doOne();
} else if (i == 2) {
    doTwo();
} else if (i == 3) {
    doThree();
} else {
    doNone();
}
I can think of two simple ways to rewrite that, either by ternary (which is just another way of writing the same structure):
(i == 1) ? doOne() :
(i == 2) ? doTwo() :
(i == 3) ? doThree() : doNone();
or using Map (in Java and I think in C# too) or Dictionary or any other K/V structure like this:
public interface IFunctor {
    void call();
}

public class OneFunctor implements IFunctor {
    public void call() {
        ref.doOne();
    }
}

/* etc. */

Map<Integer, IFunctor> methods = new HashMap<Integer, IFunctor>();
methods.put(1, new OneFunctor());
methods.put(2, new TwoFunctor());
methods.put(3, new ThreeFunctor());

/* .. */

if (methods.get(i) != null) {
    methods.get(i).call();
} else {
    doNone();
}
In fact the Map method above is what I ended up doing last time but now I can't stop thinking that there has to be better alternatives in general for this exact issue.
So, which other -and most likely better- ways to replace the if..else if..else are out there and which one is your favorite?
Your thoughts below this line!
Okay, here are your thoughts:
First, most popular answer was switch statement, like so:
switch (i) {
    case 1: doOne(); break;
    case 2: doTwo(); break;
    case 3: doThree(); break;
    default: doNone(); break;
}
That only works for values which can be used in switches, which at least in Java is quite a limiting factor. Acceptable for simple cases though, naturally.
The other, and perhaps a bit fancier, way you seem to suggest is to do it using polymorphism. The YouTube lecture linked by CMS is an excellent watch, go see it here: "The Clean Code Talks -- Inheritance, Polymorphism, & Testing". As far as I understood, this would translate to something like this:
public interface Doer {
    void do();
}

public class OneDoer implements Doer {
    public void do() {
        doOne();
    }
}

/* etc. */

/* some method of dependency injection, like a Factory: */
public class DoerFactory {
    public static Doer getDoer(int i) {
        switch (i) {
            case 1: return new OneDoer();
            case 2: return new TwoDoer();
            case 3: return new ThreeDoer();
            default: return new NoneDoer();
        }
    }
}

/* in actual code */
Doer operation = DoerFactory.getDoer(i);
operation.do();
Two interesting points from the Google talk:
Use Null Objects instead of returning nulls (and please throw only runtime exceptions)
Try to write a small project without ifs.
Also worth mentioning, in my opinion, is CDR's post, in which he shared his "perverse habits" with us; while not recommended for use, it is very interesting to look at.
Thank you all for the answers (so far), I think I might have learned something today!
These constructs can often be replaced by polymorphism. This will give you shorter and less brittle code.
In Object Oriented languages, it's common to use polymorphism to replace if's.
I liked this Google Clean Code Talk that covers the subject:
The Clean Code Talks -- Inheritance, Polymorphism, & Testing
ABSTRACT
Is your code full of if statements? Switch statements? Do you have the same switch statement in various places? When you make changes do you find yourself making the same change to the same if/switch in several places? Did you ever forget one?
This talk will discuss approaches to using Object Oriented techniques to remove many of those conditionals. The result is cleaner, tighter, better designed code that's easier to test, understand and maintain.
A switch statement:
switch (i)
{
    case 1:
        doOne();
        break;
    case 2:
        doTwo();
        break;
    case 3:
        doThree();
        break;
    default:
        doNone();
        break;
}
Depending on the type of thing you are if..else'ing, consider creating a hierarchy of objects and using polymorphism. Like so:
class iBase
{
public:
    virtual void Foo() = 0;
};

class SpecialCase1 : public iBase
{
public:
    void Foo() { /* do your magic here */ }
};

class SpecialCase2 : public iBase
{
public:
    void Foo() { /* do other magic here */ }
};
Then in your code just call p->Foo() and the right thing will happen.
There are two parts to that question.
How to dispatch based on a value? Use a switch statement. It displays your intent most clearly.
When to dispatch based on a value? Only at one place per value: create a polymorphic object that knows how to provide the expected behavior for the value.
The switch statement, of course - much prettier than all those ifs and elses.
Outside of using a switch statement, which can be faster, none. If/else is clear and easy to read. Having to look things up in a map obfuscates things. Why make code harder to read?
switch (i) {
    case 1: doOne(); break;
    case 2: doTwo(); break;
    case 3: doThree(); break;
    default: doNone(); break;
}
Having typed this, I must say that there is not that much wrong with your if statement. Like Einstein said: "Make it as simple as possible, but no simpler".
I use the following shorthand just for fun! Don't try any of these if code clarity concerns you more than the number of characters typed.
For cases where doX() always returns true:
i==1 && doOne() || i==2 && doTwo() || i==3 && doThree()
Of course I try to ensure most void functions return 1, simply to ensure that these shorthands are possible.
You can also provide assignments.
i==1 && (ret=1) || i==2 && (ret=2) || i==3 && (ret=3)
Like, instead of writing
if (a==2 && b==3 && c==4) {
    doSomething();
} else {
    doOtherThings();
}
Write
a==2 && b==3 && c==4 && doSomething() || doOtherThings();
And in cases where you're not sure what the function will return, add an ||1 :-)
a==2 && b==3 && c==4 && (doSomething()||1) || doOtherThings();
I still find it faster to type than using all those if-elses, and it sure scares all the new noobs out. Imagine a full page of statements like this with 5 levels of indenting.
"if" is rare in some of my codes and I have given it the name "if-less programming" :-)
In this simple case you could use a switch.
Otherwise a table-based approach looks fine; it would be my second choice whenever the conditions are regular enough to make it applicable, especially when the number of cases is large.
Polymorphism would be an option if there are not too many cases, and conditions and behaviour are irregular.
The example given in the question is trivial enough to work with a simple switch. The problem comes when the if-elses are nested deeper and deeper. They are no longer "clear or easy to read," (as someone else argued) and adding new code or fixing bugs in them becomes more and more difficult and harder to be sure about because you might not end up where you expected if the logic is complex.
I've seen this happen lots of times (switches nested 4 levels deep and hundreds of lines long--impossible to maintain), especially inside of factory classes that are trying to do too much for too many different unrelated types.
If the values you're comparing against are not meaningless integers, but some kind of unique identifier (i.e. using enums as a poor man's polymorphism), then you want to use classes to solve the problem. If they really are just numeric values, then I would rather use separate functions to replace the contents of the if and else blocks, and not design some kind of artificial class hierarchy to represent them. In the end that can result in messier code than the original spaghetti.
Use a switch/case it's cleaner :p
A switch statement, or classes with virtual functions as the fancy solution. Or an array of pointers to functions. It all depends on how complex the conditions are; sometimes there's no way around those ifs. And again, creating a series of classes to avoid one switch statement is clearly wrong - code should be as simple as possible (but not simpler).
I would go so far as to say that no program should ever use else. If you do, you are asking for trouble. You should never assume that if it's not an X it must be a Y. Your tests should test for each case individually and fail following such tests.
In the OO paradigm you could do it using good old polymorphism. Overly big if-else structures or switch constructs are sometimes considered a code smell.
The Map method is about the best there is. It lets you encapsulate the statements and breaks things up quite nicely. Polymorphism can complement it, but its goals are slightly different. It also introduces unnecessary class trees.
Switches have the drawback of missing break statements and fall-through, and they really encourage not breaking the problem into smaller pieces.
That being said: a small tree of if..elses is fine (in fact, I recently argued in favor for days of having 3 if..elses instead of using a Map). It's when you start to put more complicated logic in them that it becomes a problem for maintainability and readability.
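For what it's worth, the same map-of-functions idea in TypeScript; the doOne/doTwo/doThree names mirror the question, the rest is just a sketch:
// Stand-ins for the question's handlers.
function doOne(): void { console.log("one"); }
function doTwo(): void { console.log("two"); }
function doThree(): void { console.log("three"); }
function doNone(): void { console.log("none"); }

// Dispatch table: the key lookup replaces the if..else if..else chain.
const actions: Record<number, () => void> = {
  1: doOne,
  2: doTwo,
  3: doThree,
};

function dispatch(i: number): void {
  const action = actions[i];
  if (action !== undefined) {
    action();
  } else {
    doNone();
  }
}

dispatch(2); // "two"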
In python, I would write your code as:
actions = {
    1: doOne,
    2: doTwo,
    3: doThree,
}
actions[i]()
I regard these if-elseif-... constructs as "keyword noise". While it may be clear what it does, it is lacking in conciseness; I regard conciseness as an important part of readability. Most languages provide something like a switch statement. Building a map is a way to get something similar in languages that do not have such, but it certainly feels like a workaround, and there is a bit of overhead (a switch statement translates to some simple compare operations and conditional jumps, but a map first is built in memory, then queried and only then the compare and jump takes place).
In Common Lisp, there are two switch constructs built in, cond and case. cond allows arbitrary conditionals, while case only tests for equality, but is more concise.
(cond ((= i 1)
       (do-one))
      ((= i 2)
       (do-two))
      ((= i 3)
       (do-three))
      (t
       (do-none)))

(case i
  (1 (do-one))
  (2 (do-two))
  (3 (do-three))
  (otherwise (do-none)))
Of course, you could make your own case-like macro for your needs.
In Perl, you can use the for statement, optionally with an arbitrary label (here: SWITCH):
SWITCH: for ($i) {
    /1/ && do { do_one; last SWITCH; };
    /2/ && do { do_two; last SWITCH; };
    /3/ && do { do_three; last SWITCH; };
    do_none;
};
Use a Ternary Operator!
Ternary operator (53 characters):
i===1?doOne():i===2?doTwo():i===3?doThree():doNone();
If (108 characters):
if (i === 1) {
    doOne();
} else if (i === 2) {
    doTwo();
} else if (i === 3) {
    doThree();
} else {
    doNone();
}
Switch (even longer than if!?!? 114 characters):
switch (i) {
    case 1: doOne(); break;
    case 2: doTwo(); break;
    case 3: doThree(); break;
    default: doNone(); break;
}
This is all you need! It is only one line and it is pretty neat - way shorter than switch and if!
Naturally, this question is language-dependent, but a switch statement might be a better option in many cases. A good C or C++ compiler will be able to generate a jump table, which will be significantly faster for large sets of cases.
If you really must have a bunch of if tests and want to do different things whenever a test is true, I would recommend a while loop with only ifs - no else. Each if does a test and calls a method, then breaks out of the loop. No elses - there's nothing worse than a bunch of stacked if/else/if/else etc.

Should a function have only one return statement?

Are there good reasons why it's a better practice to have only one return statement in a function?
Or is it okay to return from a function as soon as it is logically correct to do so, meaning there may be many return statements in the function?
I often have several statements at the start of a method to return for "easy" situations. For example, this:
public void DoStuff(Foo foo)
{
    if (foo != null)
    {
        ...
    }
}
... can be made more readable (IMHO) like this:
public void DoStuff(Foo foo)
{
    if (foo == null) return;
    ...
}
So yes, I think it's fine to have multiple "exit points" from a function/method.
Nobody has mentioned or quoted Code Complete so I'll do it.
17.1 return
Minimize the number of returns in each routine. It's harder to understand a routine if, reading it at the bottom, you're unaware of the possibility that it returned somewhere above.
Use a return when it enhances readability. In certain routines, once you know the answer, you want to return it to the calling routine immediately. If the routine is defined in such a way that it doesn't require any cleanup, not returning immediately means that you have to write more code.
I would say it would be incredibly unwise to decide arbitrarily against multiple exit points as I have found the technique to be useful in practice over and over again, in fact I have often refactored existing code to multiple exit points for clarity. We can compare the two approaches thus:-
string fooBar(string s, int? i) {
    string ret = "";
    if (!string.IsNullOrEmpty(s) && i != null) {
        var res = someFunction(s, i);
        bool passed = true;
        foreach (var r in res) {
            if (!r.Passed) {
                passed = false;
                break;
            }
        }
        if (passed) {
            // Rest of code...
        }
    }
    return ret;
}
Compare this to the code where multiple exit points are permitted:-
string fooBar(string s, int? i) {
    var ret = "";
    if (string.IsNullOrEmpty(s) || i == null) return null;
    var res = someFunction(s, i);
    foreach (var r in res) {
        if (!r.Passed) return null;
    }
    // Rest of code...
    return ret;
}
I think the latter is considerably clearer. As far as I can tell the criticism of multiple exit points is a rather archaic point of view these days.
I currently am working on a codebase where two of the people working on it blindly subscribe to the "single point of exit" theory and I can tell you that from experience, it's a horrible horrible practice. It makes code extremely difficult to maintain and I'll show you why.
With the "single point of exit" theory, you inevitably wind up with code that looks like this:
function()
{
    HRESULT error = S_OK;

    if (SUCCEEDED(Operation1()))
    {
        if (SUCCEEDED(Operation2()))
        {
            if (SUCCEEDED(Operation3()))
            {
                if (SUCCEEDED(Operation4()))
                {
                }
                else
                {
                    error = OPERATION4FAILED;
                }
            }
            else
            {
                error = OPERATION3FAILED;
            }
        }
        else
        {
            error = OPERATION2FAILED;
        }
    }
    else
    {
        error = OPERATION1FAILED;
    }

    return error;
}
Not only does this make the code very hard to follow, but now say later on you need to go back and add an operation in between 1 and 2. You have to indent just about the entire freaking function, and good luck making sure all of your if/else conditions and braces are matched up properly.
This method makes code maintenance extremely difficult and error prone.
Structured programming says you should only ever have one return statement per function. This is to limit the complexity. Many people such as Martin Fowler argue that it is simpler to write functions with multiple return statements. He presents this argument in the classic refactoring book he wrote. This works well if you follow his other advice and write small functions. I agree with this point of view and only strict structured programming purists adhere to single return statements per function.
As Kent Beck notes when discussing guard clauses in Implementation Patterns making a routine have a single entry and exit point ...
"was to prevent the confusion possible
when jumping into and out of many
locations in the same routine. It made
good sense when applied to FORTRAN or
assembly language programs written
with lots of global data where even
understanding which statements were
executed was hard work ... with small methods and mostly local data, it is needlessly conservative."
I find a function written with guard clauses much easier to follow than one long nested bunch of if then else statements.
In a function that has no side-effects, there's no good reason to have more than a single return and you should write them in a functional style. In a method with side-effects, things are more sequential (time-indexed), so you write in an imperative style, using the return statement as a command to stop executing.
In other words, when possible, favor this style
return a > 0 ?
    positively(a) :
    negatively(a);
over this
if (a > 0)
    return positively(a);
else
    return negatively(a);
If you find yourself writing several layers of nested conditions, there's probably a way you can refactor that, using predicate list for example. If you find that your ifs and elses are far apart syntactically, you might want to break that down into smaller functions. A conditional block that spans more than a screenful of text is hard to read.
There's no hard and fast rule that applies to every language. Something like having a single return statement won't make your code good. But good code will tend to allow you to write your functions that way.
I've seen it in coding standards for C++ that were a hangover from C: if you don't have RAII or other automatic memory management, then you have to clean up before each return, which either means cut-and-paste of the clean-up or a goto (logically the same as 'finally' in managed languages), both of which are considered bad form. If your practice is to use smart pointers and collections in C++ or another automatic memory system, then there isn't a strong reason for it, and it becomes all about readability, and more of a judgement call.
I lean to the idea that return statements in the middle of the function are bad. You can use returns to build a few guard clauses at the top of the function, and of course tell the compiler what to return at the end of the function without issue, but returns in the middle of the function can be easy to miss and can make the function harder to interpret.
Are there good reasons why it's a better practice to have only one return statement in a function?
Yes, there are:
The single exit point gives an excellent place to assert your post-conditions.
Being able to put a debugger breakpoint on the one return at the end of the function is often useful.
Fewer returns means less complexity. Linear code is generally simpler to understand.
If trying to simplify a function to a single return causes complexity, then that's incentive to refactor to smaller, more general, easier-to-understand functions.
If you're in a language without destructors or if you don't use RAII, then a single return reduces the number of places you have to clean up.
Some languages require a single exit point (e.g., Pascal and Eiffel).
The question is often posed as a false dichotomy between multiple returns or deeply nested if statements. There's almost always a third solution which is very linear (no deep nesting) with only a single exit point.
Update: Apparently MISRA guidelines promote single exit, too.
To be clear, I'm not saying it's always wrong to have multiple returns. But given otherwise equivalent solutions, there are lots of good reasons to prefer the one with a single return.
Having a single exit point does provide an advantage in debugging, because it allows you to set a single breakpoint at the end of a function to see what value is actually going to be returned.
In general I try to have only a single exit point from a function. There are times, however, that doing so actually ends up creating a more complex function body than is necessary, in which case it's better to have multiple exit points. It really has to be a "judgement call" based on the resulting complexity, but the goal should be as few exit points as possible without sacrificing complexity and understandability.
No, because we don't live in the 1970s any more. If your function is long enough that multiple returns are a problem, it's too long.
(Quite apart from the fact that any multi-line function in a language with exceptions will have multiple exit points anyway.)
My preference would be for a single exit unless it really complicates things. I have found that in some cases, multiple exit points can mask other, more significant design problems:
public void DoStuff(Foo foo)
{
    if (foo == null) return;
}
On seeing this code, I would immediately ask:
Is 'foo' ever null?
If so, how many clients of 'DoStuff' ever call the function with a null 'foo'?
Depending on the answers to these questions it might be that
the check is pointless as it never is true (ie. it should be an assertion)
the check is very rarely true and so it may be better to change those specific caller functions as they should probably take some other action anyway.
In both of the above cases the code can probably be reworked with an assertion to ensure that 'foo' is never null and the relevant callers changed.
There are two other reasons (specific, I think, to C++ code) where multiple exits can actually have a negative effect: code size and compiler optimizations.
A non-POD C++ object in scope at the exit of a function will have its destructor called. Where there are several return statements, it may be the case that there are different objects in scope and so the list of destructors to call will be different. The compiler therefore needs to generate code for each return statement:
void foo(int i, int j) {
    A a;
    if (i > 0) {
        B b;
        return;  // Call dtor for 'b' followed by 'a'
    }
    if (i == j) {
        C c;
        B b;
        return;  // Call dtor for 'b', 'c' and then 'a'
    }
    return;      // Call dtor for 'a'
}
If code size is an issue - then this may be something worth avoiding.
The other issue relates to "Named Return Value Optimization" (aka copy elision, ISO C++ '03 12.8/15). C++ allows an implementation to skip calling the copy constructor if it can:
A foo() {
    A a1;
    // do something
    return a1;
}

void bar() {
    A a2(foo());
}
Just taking the code as is, the object 'a1' is constructed in 'foo' and then its copy constructor will be called to construct 'a2'. However, copy elision allows the compiler to construct 'a1' in the same place on the stack as 'a2'. There is therefore no need to "copy" the object when the function returns.
Multiple exit points complicates the work of the compiler in trying to detect this, and at least for a relatively recent version of VC++ the optimization did not take place where the function body had multiple returns. See Named Return Value Optimization in Visual C++ 2005 for more details.
Having a single exit point reduces Cyclomatic Complexity and therefore, in theory, reduces the probability that you will introduce bugs into your code when you change it. Practice however, tends to suggest that a more pragmatic approach is needed. I therefore tend to aim to have a single exit point, but allow my code to have several if that is more readable.
I force myself to use only one return statement, as it will in a sense generate a code smell. Let me explain:
function isCorrect($param1, $param2, $param3) {
    $toret = false;
    if ($param1 != $param2) {
        if ($param1 == ($param3 * 2)) {
            if ($param2 == ($param3 / 3)) {
                $toret = true;
            } else {
                $error = 'Error 3';
            }
        } else {
            $error = 'Error 2';
        }
    } else {
        $error = 'Error 1';
    }
    return $toret;
}
(The conditions are arbitrary...)
The more conditions, the larger the function gets, the more difficult it is to read. So if you're attuned to the code smell, you'll realise it, and want to refactor the code. Two possible solutions are:
Multiple returns
Refactoring into separate functions
Multiple Returns
function isCorrect($param1, $param2, $param3) {
    if ($param1 == $param2)       { $error = 'Error 1'; return false; }
    if ($param1 != ($param3 * 2)) { $error = 'Error 2'; return false; }
    if ($param2 != ($param3 / 3)) { $error = 'Error 3'; return false; }
    return true;
}
Separate Functions
function isEqual($param1, $param2) {
    return $param1 == $param2;
}

function isDouble($param1, $param2) {
    return $param1 == ($param2 * 2);
}

function isThird($param1, $param2) {
    return $param1 == ($param2 / 3);
}

function isCorrect($param1, $param2, $param3) {
    return !isEqual($param1, $param2)
        && isDouble($param1, $param3)
        && isThird($param2, $param3);
}
Granted, it is longer and a bit messy, but in the process of refactoring the function this way, we've
created a number of reusable functions,
made the function more human-readable, and
focused the functions on why the values are correct.
I would say you should have as many as required, or any that make the code cleaner (such as guard clauses).
I have personally never heard/seen any "best practices" say that you should have only one return statement.
For the most part, I tend to exit a function as soon as possible based on a logic path (guard clauses are an excellent example of this).
I believe that multiple returns are usually good (in the code that I write in C#). The single-return style is a holdover from C. But you probably aren't coding in C.
There is no law requiring only one exit point for a method in all programming languages. Some people insist on the superiority of this style, and sometimes they elevate it to a "rule" or "law" but this belief is not backed up by any evidence or research.
The multiple-return style may be a bad habit in C code, where resources have to be explicitly de-allocated, but in languages such as Java, C#, Python or JavaScript, which have constructs such as automatic garbage collection and try..finally blocks (and using blocks in C#), this argument does not apply - in these languages it is very uncommon to need centralised manual resource deallocation.
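A small sketch of that point in TypeScript: with try..finally the cleanup runs no matter which return is taken, so early returns cost nothing (the resource here is a made-up stand-in):
// Illustrative stand-in for any manually released resource.
function acquireSomething() {
  return { release: () => console.log("released") };
}

// Cleanup is centralised by try..finally, not by funnelling everything to one return.
function firstMatch(lines: string[], pattern: RegExp): string | null {
  const resource = acquireSomething();
  try {
    for (const line of lines) {
      if (pattern.test(line)) {
        return line; // early return: finally still runs
      }
    }
    return null;     // second return: finally still runs
  } finally {
    resource.release(); // the single place for manual cleanup
  }
}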
There are cases where a single return is more readable, and cases where it isn't. See if it reduces the number of lines of code, makes the logic clearer or reduces the number of braces and indents or temporary variables.
Therefore, use as many returns as suits your artistic sensibilities, because it is a layout and readability issue, not a technical one.
I have talked about this at greater length on my blog.
There are good things to say about having a single exit-point, just as there are bad things to say about the inevitable "arrow" programming that results.
If using multiple exit points during input validation or resource allocation, I try to put all the 'error-exits' very visibly at the top of the function.
Both the Spartan Programming article of the "SSDSLPedia" and the single function exit point article of the "Portland Pattern Repository's Wiki" have some insightful arguments around this. Also, of course, there is this post to consider.
If you really want a single exit-point (in any non-exception-enabled language) for example in order to release resources in one single place, I find the careful application of goto to be good; see for example this rather contrived example (compressed to save screen real-estate):
int f(int y) {
    int value = -1;
    void *data = NULL;

    if (y < 0)
        goto clean;

    if ((data = malloc(123)) == NULL)
        goto clean;

    /* More code */

    value = 1;
clean:
    free(data);
    return value;
}
Personally I, in general, dislike arrow programming more than I dislike multiple exit-points, although both are useful when applied correctly. The best, of course, is to structure your program to require neither. Breaking down your function into multiple chunks usually help :)
Although when doing so, I find I end up with multiple exit points anyway as in this example, where some larger function has been broken down into several smaller functions:
int g(int y) {
    int value = 0;

    if ((value = g0(y, value)) == -1)
        return -1;

    if ((value = g1(y, value)) == -1)
        return -1;

    return g2(y, value);
}
Depending on the project or coding guidelines, most of the boiler-plate code could be replaced by macros. As a side note, breaking it down this way makes the functions g0, g1, g2 very easy to test individually.
Obviously, in an OO and exception-enabled language, I wouldn't use if-statements like that (or at all, if I could get away with it with little enough effort), and the code would be much more plain. And non-arrowy. And most of the non-final returns would probably be exceptions.
In short:
Few returns are better than many returns
More than one return is better than huge arrows, and guard clauses are generally ok.
Exceptions could/should probably replace most 'guard clauses' when possible.
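As a small illustration of that last point, a sketch of guard clauses that throw instead of returning a sentinel value (TypeScript, made-up names, loosely following the fooBar example above):
// Guard clauses as exceptions: callers can no longer silently ignore the failure case.
function fooBar(s: string | null, i: number | null): string {
  if (s === null || s === "") throw new Error("fooBar: s is required");
  if (i === null || i < 0) throw new Error("fooBar: i must be a non-negative number");
  // Rest of code...
  return s.repeat(i);
}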
You know the adage - beauty is in the eyes of the beholder.
Some people swear by NetBeans and some by IntelliJ IDEA, some by Python and some by PHP.
In some shops you could lose your job if you insist on doing this:
public void hello()
{
    if (....)
    {
        ....
    }
}
The question is all about visibility and maintainability.
I am addicted to using Boolean algebra to reduce and simplify logic, and to using state machines. However, past colleagues believed that my use of "mathematical techniques" in coding was unsuitable, because it would not be visible and maintainable, and that it would be bad practice. Sorry people, the techniques I employ are very visible and maintainable to me - because when I return to the code six months later, I understand it clearly rather than seeing a mess of proverbial spaghetti.
Hey buddy (like a former client used to say) do what you want as long as you know how to fix it when I need you to fix it.
I remember 20 years ago, a colleague of mine was fired for employing what today would be called agile development strategy. He had a meticulous incremental plan. But his manager was yelling at him "You can't incrementally release features to users! You must stick with the waterfall." His response to the manager was that incremental development would be more precise to customer's needs. He believed in developing for the customers needs, but the manager believed in coding to "customer's requirement".
We are frequently guilty for breaking data normalization, MVP and MVC boundaries. We inline instead of constructing a function. We take shortcuts.
Personally, I believe that PHP is bad practice, but what do I know. All the theoretical arguments boil down to trying to fulfill one set of rules:
quality = precision, maintainability
and profitability.
All other rules fade into the background. And of course this rule never fades:
Laziness is the virtue of a good
programmer.
I lean towards using guard clauses to return early and otherwise exit at the end of a method. The single entry and exit rule has historical significance and was particularly helpful when dealing with legacy code that ran to 10 A4 pages for a single C++ method with multiple returns (and many defects). More recently, accepted good practice is to keep methods small which makes multiple exits less of an impedance to understanding. In the following Kronoz example copied from above, the question is what occurs in //Rest of code...?:
string fooBar(string s, int? i) {
    if (string.IsNullOrEmpty(s) || i == null) return null;
    var res = someFunction(s, i);
    foreach (var r in res) {
        if (!r.Passed) return null;
    }
    // Rest of code...
    return ret;
}
I realise the example is somewhat contrived but I would be tempted to refactor the foreach loop into a LINQ statement that could then be considered a guard clause. Again, in a contrived example the intent of the code isn't apparent and someFunction() may have some other side effect or the result may be used in the // Rest of code....
if (string.IsNullOrEmpty(s) || i == null) return null;
if (someFunction(s, i).Any(r => !r.Passed)) return null;
Giving the following refactored function:
string fooBar(string s, int? i) {
    if (string.IsNullOrEmpty(s) || i == null) return null;
    if (someFunction(s, i).Any(r => !r.Passed)) return null;
    // Rest of code...
    return ret;
}
One good reason I can think of is for code maintenance: you have a single point of exit. If you want to change the format of the result,..., it's just much simpler to implement. Also, for debugging, you can just stick a breakpoint there :)
Having said that, I once had to work in a library where the coding standards imposed 'one return statement per function', and I found it pretty tough. I write lots of numerical computations code, and there often are 'special cases', so the code ended up being quite hard to follow...
Multiple exit points are fine for small enough functions - that is, a function that can be viewed on one screen length in its entirety. If a lengthy function likewise includes multiple exit points, it's a sign that the function can be chopped up further.
That said I avoid multiple-exit functions unless absolutely necessary. I have felt pain of bugs that are due to some stray return in some obscure line in more complex functions.
I've worked with terrible coding standards that forced a single exit path on you and the result is nearly always unstructured spaghetti if the function is anything but trivial -- you end up with lots of breaks and continues that just get in the way.
A single exit point - all other things equal - makes code significantly more readable.
But there's a catch: the popular construction
resulttype res;
if if if...
return res;
is a fake; "res =" is not much better than "return". It has a single return statement, but multiple points where the function actually ends.
If you have a function with multiple returns (or "res ="s), it's often a good idea to break it into several smaller functions with a single exit point.
My usual policy is to have only one return statement at the end of a function, unless the complexity of the code is greatly reduced by adding more. In fact, I'm rather a fan of Eiffel, which enforces the one-return rule by having no return statement (there's just an auto-created 'result' variable to put your result in).
There certainly are cases where code can be made clearer with multiple returns than the obvious version without them would be. One could argue that more rework is needed if you have a function that is too complex to be understandable without multiple return statements, but sometimes it's good to be pragmatic about such things.
If you end up with more than a few returns there may be something wrong with your code. Otherwise I would agree that sometimes it is nice to be able to return from multiple places in a subroutine, especially when it make the code cleaner.
Perl 6: Bad Example
sub Int_to_String( Int i ){
    given( i ){
        when 0 { return "zero" }
        when 1 { return "one" }
        when 2 { return "two" }
        when 3 { return "three" }
        when 4 { return "four" }
        ...
        default { return undef }
    }
}
would be better written like this
Perl 6: Good Example
@Int_to_String = qw{
    zero
    one
    two
    three
    four
    ...
};

sub Int_to_String( Int i ){
    return undef if i < 0;
    return undef unless i < @Int_to_String.length;
    return @Int_to_String[i];
}
Note this was just a quick example.
I vote for a single return at the end as a guideline. It helps with common code clean-up handling. For example, take a look at the following code...
void ProcessMyFile (char *szFileName)
{
    FILE *fp = NULL;
    char *pbyBuffer = NULL;

    do {
        fp = fopen (szFileName, "r");
        if (NULL == fp) {
            break;
        }

        pbyBuffer = malloc (__SOME__SIZE___);
        if (NULL == pbyBuffer) {
            break;
        }

        /*** Do some processing with file ***/

    } while (0);

    if (pbyBuffer) {
        free (pbyBuffer);
    }

    if (fp) {
        fclose (fp);
    }
}
This is probably an unusual perspective, but I think that anyone who believes that multiple return statements are to be favoured has never had to use a debugger on a microprocessor that supports only 4 hardware breakpoints. ;-)
While the issues of "arrow code" are completely correct, one issue that seems to go away when using multiple return statements is in the situation where you are using a debugger. You have no convenient catch-all position to put a breakpoint to guarantee that you're going to see the exit and hence the return condition.
The more return statements you have in a function, the higher the complexity of that one method. If you find yourself wondering whether you have too many return statements, you might want to ask yourself whether you have too many lines of code in that function.
But no, there is nothing wrong with one or many return statements. In some languages it is a better practice (C++) than in others (C).