Does the construct do .. while(false) contribute to better control flow? - language-agnostic

I've recently come across this code:
do {
if ( ! checkSomething() )
break;
// some code
if ( ! checkSomeOtherThing() )
break;
// some other code
} while(false);
// some final code
The programmer that wrote it, wrote a comment along the lines of "cleaner control flow".
In my opinion, the original code could look better if its refactored into something else. But is there any truth in this statement ? Is this construct any good ?

I find this much easier to read, and it produces an identical result:
if ( checkSomething() )
{
// some code
if ( checkSomeOtherThing() )
{
// some other code
}
}
// some final code
I think do ... while is normally hard to follow, but using it for something other than a loop is misleading at best.

This is equivalent to a goto.
In such situations, it is better to use a goto than to use an ugly hack.
Changing it to use a goto makes it much more readable:
if (!checkSomething())
goto Done;
// some code
if (!checkSomeOtherThing())
goto Done;
// some other code
Done: //some final code

If you don't mind loops containing several break statements, then the only problem here is that C (for obvious reasons) doesn't let you break out of a bare block, hence the "non-loop" which some unsuspecting future maintainer could mistake for a real loop.
The considerations, I think, are:
if there are only two break points, what's so bad about two if statements?
if there are more than two break points then the indentation with if statements could get unpleasant, and this saves that, but then again is the function doing too much? And even if not, would it be better just to use goto and avoid the weirdness of a loop that doesn't loop?
Since you tag this language-agnostic, I used to use a macroised assembly language, with a block ... endblock that you could break out of. This lead to reasonably nice code for checking necessary conditions, such as:
block
breakif str1 == null
breakif str2 == null
get some combined property of str1 and str2
breakif some other condition that stops us getting on with it
get on with it
endblock
Actually, it wasn't breakif str1 == null, it was breakifeq.p str1, null, or something like that, but I forget exactly what.

I've seen the do-while form adopted as a standard to which coders conformed. The advantage is that it communicates, and implements, that the loop will always be executed at least once. This helps isolate, with consistency, the situations where something different occurs, i.e. where the code in the loop is not executed.
This standard was adopted because the Warnier-Orr technique was being applied.

Related

Should I avoid do/while and favour while? [duplicate]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
When I was taking CS in college (mid 80's), one of the ideas that was constantly repeated was to always write loops which test at the top (while...) rather than at the bottom (do ... while) of the loop. These notions were often backed up with references to studies which showed that loops which tested at the top were statistically much more likely to be correct than their bottom-testing counterparts.
As a result, I almost always write loops which test at the top. I don't do it if it introduces extra complexity in the code, but that case seems rare. I notice that some programmers tend to almost exclusively write loops that test at the bottom. When I see constructs like:
if (condition)
{
do
{
...
} while (same condition);
}
or the inverse (if inside the while), it makes me wonder if they actually wrote it that way or if they added the if statement when they realized the loop didn't handle the null case.
I've done some googling, but haven't been able to find any literature on this subject. How do you guys (and gals) write your loops?
I always follow the rule that if it should run zero or more times, test at the beginning, if it must run once or more, test at the end. I do not see any logical reason to use the code you listed in your example. It only adds complexity.
Use while loops when you want to test a condition before the first iteration of the loop.
Use do-while loops when you want to test a condition after running the first iteration of the loop.
For example, if you find yourself doing something like either of these snippets:
func();
while (condition) {
func();
}
//or:
while (true){
func();
if (!condition) break;
}
You should rewrite it as:
do{
func();
} while(condition);
Difference is that the do loop executes "do something" once and then checks the condition to see if it should repeat the "do something" while the while loop checks the condition before doing anything
Does avoiding do/while really help make my code more readable?
No.
If it makes more sense to use a do/while loop, then do so. If you need to execute the body of a loop once before testing the condition, then a do/while loop is probably the most straightforward implementation.
First one may not execute at all if condition is false. Other one will execute at least once, then check the conidition.
For the sake of readability it seems sensible to test at the top. The fact it is a loop is important; the person reading the code should be aware of the loop conditions before trying to comprehend the body of the loop.
Here's a good real-world example I came across recently. Suppose you have a number of processing tasks (like processing elements in an array) and you wish to split the work between one thread per CPU core present. There must be at least one core to be running the current code! So you can use a do... while something like:
do {
get_tasks_for_core();
launch_thread();
} while (cores_remaining());
It's almost negligable, but it might be worth considering the performance benefit: it could equally be written as a standard while loop, but that would always make an unnecessary initial comparison that would always evaluate true - and on single-core, the do-while condition branches more predictably (always false, versus alternating true/false for a standard while).
Yaa..its true.. do while will run atleast one time.
Thats the only difference. Nothing else to debate on this
The first tests the condition before performing so it's possible your code won't ever enter the code underneath. The second will perform the code within before testing the condition.
The while loop will check "condition" first; if it's false, it will never "do something." But the do...while loop will "do something" first, then check "condition".
Yes, just like using for instead of while, or foreach instead of for improves readability. That said some circumstances need do while and I agree you would be silly to force those situations into a while loop.
It's more helpful to think in terms of common usage. The vast majority of while loops work quite naturally with while, even if they could be made to work with do...while, so basically you should use it when the difference doesn't matter. I would thus use do...while for the rare scenarios where it provides a noticeable improvement in readability.
The use cases are different for the two. This isn't a "best practices" question.
If you want a loop to execute based on the condition exclusively than use
for or while
If you want to do something once regardless of the the condition and then continue doing it based the condition evaluation.
do..while
For anyone who can't think of a reason to have a one-or-more times loop:
try {
someOperation();
} catch (Exception e) {
do {
if (e instanceof ExceptionIHandleInAWierdWay) {
HandleWierdException((ExceptionIHandleInAWierdWay)e);
}
} while ((e = e.getInnerException())!= null);
}
The same could be used for any sort of hierarchical structure.
in class Node:
public Node findSelfOrParentWithText(string text) {
Node node = this;
do {
if(node.containsText(text)) {
break;
}
} while((node = node.getParent()) != null);
return node;
}
A while() checks the condition before each execution of the loop body and a do...while() checks the condition after each execution of the loop body.
Thus, **do...while()**s will always execute the loop body at least once.
Functionally, a while() is equivalent to
startOfLoop:
if (!condition)
goto endOfLoop;
//loop body goes here
goto startOfLoop;
endOfLoop:
and a do...while() is equivalent to
startOfLoop:
//loop body
//goes here
if (condition)
goto startOfLoop;
Note that the implementation is probably more efficient than this. However, a do...while() does involve one less comparison than a while() so it is slightly faster. Use a do...while() if:
you know that the condition will always be true the first time around, or
you want the loop to execute once even if the condition is false to begin with.
Here is the translation:
do { y; } while(x);
Same as
{ y; } while(x) { y; }
Note the extra set of braces are for the case you have variable definitions in y. The scope of those must be kept local like in the do-loop case. So, a do-while loop just executes its body at least once. Apart from that, the two loops are identical. So if we apply this rule to your code
do {
// do something
} while (condition is true);
The corresponding while loop for your do-loop looks like
{
// do something
}
while (condition is true) {
// do something
}
Yes, you see the corresponding while for your do loop differs from your while :)
As noted by Piemasons, the difference is whether the loop executes once before doing the test, or if the test is done first so that the body of the loop might never execute.
The key question is which makes sense for your application.
To take two simple examples:
Say you're looping through the elements of an array. If the array has no elements, you don't want to process number one of zero. So you should use WHILE.
You want to display a message, accept a response, and if the response is invalid, ask again until you get a valid response. So you always want to ask once. You can't test if the response is valid until you get a response, so you have to go through the body of the loop once before you can test the condition. You should use DO/WHILE.
I tend to prefer do-while loops, myself. If the condition will always be true at the start of the loop, I prefer to test it at the end. To my eye, the whole point of testing conditions (other than assertions) is that one doesn't know the result of the test. If I see a while loop with the condition test at the top, my inclination is to consider the case that the loop executes zero times. If that can never happen, why not code in a way that clearly shows that?
It's actually meant for a different things. In C, you can use do - while construct to achieve both scenario (runs at least once and runs while true). But PASCAL has repeat - until and while for each scenario, and if I remember correctly, ADA has another construct that lets you quit in the middle, but of course that's not what you're asking.
My answer to your question : I like my loop with testing on top.
Both conventions are correct if you know how to write the code correctly :)
Usually the use of second convention ( do {} while() ) is meant to avoid have a duplicated statement outside the loop. Consider the following (over simplified) example:
a++;
while (a < n) {
a++;
}
can be written more concisely using
do {
a++;
} while (a < n)
Of course, this particular example can be written in an even more concise way as (assuming C syntax)
while (++a < n) {}
But I think you can see the point here.
while( someConditionMayBeFalse ){
// this will never run...
}
// then the alternative
do{
// this will run once even if the condition is false
while( someConditionMayBeFalse );
The difference is obvious and allows you to have code run and then evaluate the result to see if you have to "Do it again" and the other method of while allows you to have a block of script ignored if the conditional is not met.
I write mine pretty much exclusively testing at the top. It's less code, so for me at least, it's less potential to screw something up (e.g., copy-pasting the condition makes two places you always have to update it)
It really depends there are situations when you want to test at the top, others when you want to test at the bottom, and still others when you want to test in the middle.
However the example given seems absurd. If you are going to test at the top, don't use an if statement and test at the bottom, just use a while statement, that's what it is made for.
You should first think of the test as part of the loop code. If the test logically belongs at the start of the loop processing, then it's a top-of-the-loop test. If the test logically belongs at the end of the loop (i.e. it decides if the loop should continue to run), then it's probably a bottom-of-the-loop test.
You will have to do something fancy if the test logically belongs in them middle. :-)
I guess some people test at the bottom because you could save one or a few machine cycles by doing that 30 years ago.
To write code that is correct, one basically needs to perform a mental, perhaps informal proof of correctness.
To prove a loop correct, the standard way is to choose a loop invariant, and an induction proof. But skip the complicated words: what you do, informally, is figure out something that is true of each iteration of the loop, and that when the loop is done, what you wanted accomplished is now true. The loop invariant is false at the end, for the loop to terminate.
If the loop conditions map fairly easily to the invariant, and the invariant is at the top of the loop, and one infers that the invariant is true at the next iteration of the loop by working through the code of the loop, then it is easy to figure out that the loop is correct.
However, if the invariant is at the bottom of the loop, then unless you have an assertion just prior to the loop (a good practice) then it becomes more difficult because you have to essentially infer what that invariant should be, and that any code that ran before the loop makes the loop invariant true (since there is no loop precondition, code will execute in the loop). It just becomes that more difficult to prove correct, even if it is an informal in-your-head proof.
This isn't really an answer but a reiteration of something one of my lecturers said and it interested me at the time.
The two types of loop while..do and do..while are actually instances of a third more generic loop, which has the test somewhere in the middle.
begin loop
<Code block A>
loop condition
<Code block B>
end loop
Code block A is executed at least once and B is executed zero or more times, but isn't run on the very last (failing) iteration. a while loop is when code block a is empty and a do..while is when code block b is empty. But if you're writing a compiler, you might be interested in generalizing both cases to a loop like this.
In a typical Discrete Structures class in computer science, it's an easy proof that there is an equivalence mapping between the two.
Stylistically, I prefer while (easy-expr) { } when easy-expr is known up front and ready to go, and the loop doesn't have a lot of repeated overhead/initialization. I prefer do { } while (somewhat-less-easy-expr); when there is more repeated overhead and the condition may not be quite so simple to set up ahead of time. If I write an infinite loop, I always use while (true) { }. I can't explain why, but I just don't like writing for (;;) { }.
I would say it is bad practice to write if..do..while loops, for the simple reason that this increases the size of the code and causes code duplications. Code duplications are error prone and should be avoided, as any change to one part must be performed on the duplicate as well, which isn't always the case. Also, bigger code means a harder time on the cpu cache. Finally, it handles null cases, and solves head aches.
Only when the first loop is fundamentally different should one use do..while, say, if the code that makes you pass the loop condition (like initialization) is performed in the loop. Otherwise, if it certain that loop will never fall on the first iteration, then yes, a do..while is appropriate.
From my limited knowledge of code generation I think it may be a good idea to write bottom test loops since they enable the compiler to perform loop optimizations better. For bottom test loops it is guaranteed that the loop executes at least once. This means loop invariant code "dominates" the exit node. And thus can be safely moved just before the loop starts.

Successive success checks

Most of you have probably bumped into a situation, where multiple things must be in check and in certain order before the application can proceed, for example in a very simple case of creating a listening socket (socket, bind, listen, accept etc.). There are at least two obvious ways (don't take this 100% verbatim):
if (1st_ok)
{
if (2nd_ok)
{
...
or
if (!1st_ok)
{
return;
}
if (!2nd_ok)
{
return;
}
...
Have you ever though of anything smarter, do you prefer one over the other of the above, or do you (if the language provides for it) use exceptions?
I prefer the second technique. The main problem with the first one is that it increases the nesting depth of the code, which is a significant issue when you've got a substantial number of preconditions/resource-allocs to check since the business part of the function ends up deeply buried behind a wall of conditions (and frequently loops too). In the second case, you can simplify the conceptual logic to "we've got here and everything's OK", which is much easier to work with. Keeping the normal case as straight-line as possible is just easier to grok, especially when doing maintenance coding.
It depends on the language - e.g. in C++ you might well use exceptions, while in C you might use one of several strategies:
if/else blocks
goto (one of the few cases where a single goto label for "exception" handling might be justified
use break within a do { ... } while (0) loop
Personally I don't like multiple return statements in a function - I prefer to have a common clean up block at the end of the function followed by a single return statement.
This tends to be a matter of style. Some people only like returning at the end of a procedure, others prefer to do it wherever needed.
I'm a fan of the second method, as it allows for clean and concise code as well as ease of adding documentation on what it's doing.
// Checking for llama integration
if (!1st_ok)
{
return;
}
// Llama found, loading spitting capacity
if (!2nd_ok)
{
return;
}
// Etc.
I prefer the second version.
In the normal case, all code between the checks executes sequentially, so I like to see them at the same level. Normally none of the if branches are executed, so I want them to be as unobtrusive as possible.
I use 2nd because I think It reads better and easier to follow the logic. Also they say exceptions should not be used for flow control, but for the exceptional and unexpected cases. Id like to see what pros say about this.
What about
if (1st_ok && 2nd_ok) { }
or if some work must be done, like in your example with sockets
if (1st_ok() && 2nd_ok()) { }
I avoid the first solution because of nesting.
I avoid the second solution because of corporate coding rules which forbid multiple return in a function body.
Of course coding rules also forbid goto.
My workaround is to use a local variable:
bool isFailed = false; // or whatever is available for bool/true/false
if (!check1) {
log_error();
try_recovery_action();
isFailed = true;
}
if (!isfailed) {
if (!check2) {
log_error();
try_recovery_action();
isFailed = true;
}
}
...
This is not as beautiful as I would like but it is the best I've found to conform to my constraints and to write a readable code.
For what it is worth, here are some of my thoughts and experiences on this question.
Personally, I tend to prefer the second case you outlined. I find it easier to follow (and debug) the code. That is, as the code progresses, it becomes "more correct". In my own experience, this has seemed to be the preferred method.
I don't know how common it is in the field, but I've also seen condition testing written as ...
error = foo1 ();
if ((error == OK) && test1)) {
error = foo2 ();
}
if ((error == OK) && (test2)) {
error = foo3 ();
}
...
return (error);
Although readable (always a plus in my books) and avoiding deep nesting, it always struck me as using a lot of unnecessary testing to achieve those ends.
The first method, I see used less frequently than the second. Of those times, the vast majority of the time was because there was no nice way around it. For the remaining few instances, it was justified on the basis of extracting a little more performance on the success case. The argument was that the processor would predict a forward branch as not taken (corresponding to the else clause). This depended upon several factors including, the architecture, compiler, language, need, .... Obviously most projects (and most aspects of the project) did not meet those requirements.
Hope this helps.

Test loops at the top or bottom? (while vs. do while) [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
When I was taking CS in college (mid 80's), one of the ideas that was constantly repeated was to always write loops which test at the top (while...) rather than at the bottom (do ... while) of the loop. These notions were often backed up with references to studies which showed that loops which tested at the top were statistically much more likely to be correct than their bottom-testing counterparts.
As a result, I almost always write loops which test at the top. I don't do it if it introduces extra complexity in the code, but that case seems rare. I notice that some programmers tend to almost exclusively write loops that test at the bottom. When I see constructs like:
if (condition)
{
do
{
...
} while (same condition);
}
or the inverse (if inside the while), it makes me wonder if they actually wrote it that way or if they added the if statement when they realized the loop didn't handle the null case.
I've done some googling, but haven't been able to find any literature on this subject. How do you guys (and gals) write your loops?
I always follow the rule that if it should run zero or more times, test at the beginning, if it must run once or more, test at the end. I do not see any logical reason to use the code you listed in your example. It only adds complexity.
Use while loops when you want to test a condition before the first iteration of the loop.
Use do-while loops when you want to test a condition after running the first iteration of the loop.
For example, if you find yourself doing something like either of these snippets:
func();
while (condition) {
func();
}
//or:
while (true){
func();
if (!condition) break;
}
You should rewrite it as:
do{
func();
} while(condition);
Difference is that the do loop executes "do something" once and then checks the condition to see if it should repeat the "do something" while the while loop checks the condition before doing anything
Does avoiding do/while really help make my code more readable?
No.
If it makes more sense to use a do/while loop, then do so. If you need to execute the body of a loop once before testing the condition, then a do/while loop is probably the most straightforward implementation.
First one may not execute at all if condition is false. Other one will execute at least once, then check the conidition.
For the sake of readability it seems sensible to test at the top. The fact it is a loop is important; the person reading the code should be aware of the loop conditions before trying to comprehend the body of the loop.
Here's a good real-world example I came across recently. Suppose you have a number of processing tasks (like processing elements in an array) and you wish to split the work between one thread per CPU core present. There must be at least one core to be running the current code! So you can use a do... while something like:
do {
get_tasks_for_core();
launch_thread();
} while (cores_remaining());
It's almost negligable, but it might be worth considering the performance benefit: it could equally be written as a standard while loop, but that would always make an unnecessary initial comparison that would always evaluate true - and on single-core, the do-while condition branches more predictably (always false, versus alternating true/false for a standard while).
Yaa..its true.. do while will run atleast one time.
Thats the only difference. Nothing else to debate on this
The first tests the condition before performing so it's possible your code won't ever enter the code underneath. The second will perform the code within before testing the condition.
The while loop will check "condition" first; if it's false, it will never "do something." But the do...while loop will "do something" first, then check "condition".
Yes, just like using for instead of while, or foreach instead of for improves readability. That said some circumstances need do while and I agree you would be silly to force those situations into a while loop.
It's more helpful to think in terms of common usage. The vast majority of while loops work quite naturally with while, even if they could be made to work with do...while, so basically you should use it when the difference doesn't matter. I would thus use do...while for the rare scenarios where it provides a noticeable improvement in readability.
The use cases are different for the two. This isn't a "best practices" question.
If you want a loop to execute based on the condition exclusively than use
for or while
If you want to do something once regardless of the the condition and then continue doing it based the condition evaluation.
do..while
For anyone who can't think of a reason to have a one-or-more times loop:
try {
someOperation();
} catch (Exception e) {
do {
if (e instanceof ExceptionIHandleInAWierdWay) {
HandleWierdException((ExceptionIHandleInAWierdWay)e);
}
} while ((e = e.getInnerException())!= null);
}
The same could be used for any sort of hierarchical structure.
in class Node:
public Node findSelfOrParentWithText(string text) {
Node node = this;
do {
if(node.containsText(text)) {
break;
}
} while((node = node.getParent()) != null);
return node;
}
A while() checks the condition before each execution of the loop body and a do...while() checks the condition after each execution of the loop body.
Thus, **do...while()**s will always execute the loop body at least once.
Functionally, a while() is equivalent to
startOfLoop:
if (!condition)
goto endOfLoop;
//loop body goes here
goto startOfLoop;
endOfLoop:
and a do...while() is equivalent to
startOfLoop:
//loop body
//goes here
if (condition)
goto startOfLoop;
Note that the implementation is probably more efficient than this. However, a do...while() does involve one less comparison than a while() so it is slightly faster. Use a do...while() if:
you know that the condition will always be true the first time around, or
you want the loop to execute once even if the condition is false to begin with.
Here is the translation:
do { y; } while(x);
Same as
{ y; } while(x) { y; }
Note the extra set of braces are for the case you have variable definitions in y. The scope of those must be kept local like in the do-loop case. So, a do-while loop just executes its body at least once. Apart from that, the two loops are identical. So if we apply this rule to your code
do {
// do something
} while (condition is true);
The corresponding while loop for your do-loop looks like
{
// do something
}
while (condition is true) {
// do something
}
Yes, you see the corresponding while for your do loop differs from your while :)
As noted by Piemasons, the difference is whether the loop executes once before doing the test, or if the test is done first so that the body of the loop might never execute.
The key question is which makes sense for your application.
To take two simple examples:
Say you're looping through the elements of an array. If the array has no elements, you don't want to process number one of zero. So you should use WHILE.
You want to display a message, accept a response, and if the response is invalid, ask again until you get a valid response. So you always want to ask once. You can't test if the response is valid until you get a response, so you have to go through the body of the loop once before you can test the condition. You should use DO/WHILE.
I tend to prefer do-while loops, myself. If the condition will always be true at the start of the loop, I prefer to test it at the end. To my eye, the whole point of testing conditions (other than assertions) is that one doesn't know the result of the test. If I see a while loop with the condition test at the top, my inclination is to consider the case that the loop executes zero times. If that can never happen, why not code in a way that clearly shows that?
It's actually meant for a different things. In C, you can use do - while construct to achieve both scenario (runs at least once and runs while true). But PASCAL has repeat - until and while for each scenario, and if I remember correctly, ADA has another construct that lets you quit in the middle, but of course that's not what you're asking.
My answer to your question : I like my loop with testing on top.
Both conventions are correct if you know how to write the code correctly :)
Usually the use of second convention ( do {} while() ) is meant to avoid have a duplicated statement outside the loop. Consider the following (over simplified) example:
a++;
while (a < n) {
a++;
}
can be written more concisely using
do {
a++;
} while (a < n)
Of course, this particular example can be written in an even more concise way as (assuming C syntax)
while (++a < n) {}
But I think you can see the point here.
while( someConditionMayBeFalse ){
// this will never run...
}
// then the alternative
do{
// this will run once even if the condition is false
while( someConditionMayBeFalse );
The difference is obvious and allows you to have code run and then evaluate the result to see if you have to "Do it again" and the other method of while allows you to have a block of script ignored if the conditional is not met.
I write mine pretty much exclusively testing at the top. It's less code, so for me at least, it's less potential to screw something up (e.g., copy-pasting the condition makes two places you always have to update it)
It really depends there are situations when you want to test at the top, others when you want to test at the bottom, and still others when you want to test in the middle.
However the example given seems absurd. If you are going to test at the top, don't use an if statement and test at the bottom, just use a while statement, that's what it is made for.
You should first think of the test as part of the loop code. If the test logically belongs at the start of the loop processing, then it's a top-of-the-loop test. If the test logically belongs at the end of the loop (i.e. it decides if the loop should continue to run), then it's probably a bottom-of-the-loop test.
You will have to do something fancy if the test logically belongs in them middle. :-)
I guess some people test at the bottom because you could save one or a few machine cycles by doing that 30 years ago.
To write code that is correct, one basically needs to perform a mental, perhaps informal proof of correctness.
To prove a loop correct, the standard way is to choose a loop invariant, and an induction proof. But skip the complicated words: what you do, informally, is figure out something that is true of each iteration of the loop, and that when the loop is done, what you wanted accomplished is now true. The loop invariant is false at the end, for the loop to terminate.
If the loop conditions map fairly easily to the invariant, and the invariant is at the top of the loop, and one infers that the invariant is true at the next iteration of the loop by working through the code of the loop, then it is easy to figure out that the loop is correct.
However, if the invariant is at the bottom of the loop, then unless you have an assertion just prior to the loop (a good practice) then it becomes more difficult because you have to essentially infer what that invariant should be, and that any code that ran before the loop makes the loop invariant true (since there is no loop precondition, code will execute in the loop). It just becomes that more difficult to prove correct, even if it is an informal in-your-head proof.
This isn't really an answer but a reiteration of something one of my lecturers said and it interested me at the time.
The two types of loop while..do and do..while are actually instances of a third more generic loop, which has the test somewhere in the middle.
begin loop
<Code block A>
loop condition
<Code block B>
end loop
Code block A is executed at least once and B is executed zero or more times, but isn't run on the very last (failing) iteration. a while loop is when code block a is empty and a do..while is when code block b is empty. But if you're writing a compiler, you might be interested in generalizing both cases to a loop like this.
In a typical Discrete Structures class in computer science, it's an easy proof that there is an equivalence mapping between the two.
Stylistically, I prefer while (easy-expr) { } when easy-expr is known up front and ready to go, and the loop doesn't have a lot of repeated overhead/initialization. I prefer do { } while (somewhat-less-easy-expr); when there is more repeated overhead and the condition may not be quite so simple to set up ahead of time. If I write an infinite loop, I always use while (true) { }. I can't explain why, but I just don't like writing for (;;) { }.
I would say it is bad practice to write if..do..while loops, for the simple reason that this increases the size of the code and causes code duplications. Code duplications are error prone and should be avoided, as any change to one part must be performed on the duplicate as well, which isn't always the case. Also, bigger code means a harder time on the cpu cache. Finally, it handles null cases, and solves head aches.
Only when the first loop is fundamentally different should one use do..while, say, if the code that makes you pass the loop condition (like initialization) is performed in the loop. Otherwise, if it certain that loop will never fall on the first iteration, then yes, a do..while is appropriate.
From my limited knowledge of code generation I think it may be a good idea to write bottom test loops since they enable the compiler to perform loop optimizations better. For bottom test loops it is guaranteed that the loop executes at least once. This means loop invariant code "dominates" the exit node. And thus can be safely moved just before the loop starts.

Should a function have only one return statement?

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Are there good reasons why it's a better practice to have only one return statement in a function?
Or is it okay to return from a function as soon as it is logically correct to do so, meaning there may be many return statements in the function?
I often have several statements at the start of a method to return for "easy" situations. For example, this:
public void DoStuff(Foo foo)
{
if (foo != null)
{
...
}
}
... can be made more readable (IMHO) like this:
public void DoStuff(Foo foo)
{
if (foo == null) return;
...
}
So yes, I think it's fine to have multiple "exit points" from a function/method.
Nobody has mentioned or quoted Code Complete so I'll do it.
17.1 return
Minimize the number of returns in each routine. It's harder to understand a routine if, reading it at the bottom, you're unaware of the possibility that it returned somewhere above.
Use a return when it enhances readability. In certain routines, once you know the answer, you want to return it to the calling routine immediately. If the routine is defined in such a way that it doesn't require any cleanup, not returning immediately means that you have to write more code.
I would say it would be incredibly unwise to decide arbitrarily against multiple exit points as I have found the technique to be useful in practice over and over again, in fact I have often refactored existing code to multiple exit points for clarity. We can compare the two approaches thus:-
string fooBar(string s, int? i) {
string ret = "";
if(!string.IsNullOrEmpty(s) && i != null) {
var res = someFunction(s, i);
bool passed = true;
foreach(var r in res) {
if(!r.Passed) {
passed = false;
break;
}
}
if(passed) {
// Rest of code...
}
}
return ret;
}
Compare this to the code where multiple exit points are permitted:-
string fooBar(string s, int? i) {
var ret = "";
if(string.IsNullOrEmpty(s) || i == null) return null;
var res = someFunction(s, i);
foreach(var r in res) {
if(!r.Passed) return null;
}
// Rest of code...
return ret;
}
I think the latter is considerably clearer. As far as I can tell the criticism of multiple exit points is a rather archaic point of view these days.
I currently am working on a codebase where two of the people working on it blindly subscribe to the "single point of exit" theory and I can tell you that from experience, it's a horrible horrible practice. It makes code extremely difficult to maintain and I'll show you why.
With the "single point of exit" theory, you inevitably wind up with code that looks like this:
function()
{
HRESULT error = S_OK;
if(SUCCEEDED(Operation1()))
{
if(SUCCEEDED(Operation2()))
{
if(SUCCEEDED(Operation3()))
{
if(SUCCEEDED(Operation4()))
{
}
else
{
error = OPERATION4FAILED;
}
}
else
{
error = OPERATION3FAILED;
}
}
else
{
error = OPERATION2FAILED;
}
}
else
{
error = OPERATION1FAILED;
}
return error;
}
Not only does this make the code very hard to follow, but now say later on you need to go back and add an operation in between 1 and 2. You have to indent just about the entire freaking function, and good luck making sure all of your if/else conditions and braces are matched up properly.
This method makes code maintenance extremely difficult and error prone.
Structured programming says you should only ever have one return statement per function. This is to limit the complexity. Many people such as Martin Fowler argue that it is simpler to write functions with multiple return statements. He presents this argument in the classic refactoring book he wrote. This works well if you follow his other advice and write small functions. I agree with this point of view and only strict structured programming purists adhere to single return statements per function.
As Kent Beck notes when discussing guard clauses in Implementation Patterns making a routine have a single entry and exit point ...
"was to prevent the confusion possible
when jumping into and out of many
locations in the same routine. It made
good sense when applied to FORTRAN or
assembly language programs written
with lots of global data where even
understanding which statements were
executed was hard work ... with small methods and mostly local data, it is needlessly conservative."
I find a function written with guard clauses much easier to follow than one long nested bunch of if then else statements.
In a function that has no side-effects, there's no good reason to have more than a single return and you should write them in a functional style. In a method with side-effects, things are more sequential (time-indexed), so you write in an imperative style, using the return statement as a command to stop executing.
In other words, when possible, favor this style
return a > 0 ?
positively(a):
negatively(a);
over this
if (a > 0)
return positively(a);
else
return negatively(a);
If you find yourself writing several layers of nested conditions, there's probably a way you can refactor that, using predicate list for example. If you find that your ifs and elses are far apart syntactically, you might want to break that down into smaller functions. A conditional block that spans more than a screenful of text is hard to read.
There's no hard and fast rule that applies to every language. Something like having a single return statement won't make your code good. But good code will tend to allow you to write your functions that way.
I've seen it in coding standards for C++ that were a hang-over from C, as if you don't have RAII or other automatic memory management then you have to clean up for each return, which either means cut-and-paste of the clean-up or a goto (logically the same as 'finally' in managed languages), both of which are considered bad form. If your practices are to use smart pointers and collections in C++ or another automatic memory system, then there isn't a strong reason for it, and it become all about readability, and more of a judgement call.
I lean to the idea that return statements in the middle of the function are bad. You can use returns to build a few guard clauses at the top of the function, and of course tell the compiler what to return at the end of the function without issue, but returns in the middle of the function can be easy to miss and can make the function harder to interpret.
Are there good reasons why it's a better practice to have only one return statement in a function?
Yes, there are:
The single exit point gives an excellent place to assert your post-conditions.
Being able to put a debugger breakpoint on the one return at the end of the function is often useful.
Fewer returns means less complexity. Linear code is generally simpler to understand.
If trying to simplify a function to a single return causes complexity, then that's incentive to refactor to smaller, more general, easier-to-understand functions.
If you're in a language without destructors or if you don't use RAII, then a single return reduces the number of places you have to clean up.
Some languages require a single exit point (e.g., Pascal and Eiffel).
The question is often posed as a false dichotomy between multiple returns or deeply nested if statements. There's almost always a third solution which is very linear (no deep nesting) with only a single exit point.
Update: Apparently MISRA guidelines promote single exit, too.
To be clear, I'm not saying it's always wrong to have multiple returns. But given otherwise equivalent solutions, there are lots of good reasons to prefer the one with a single return.
Having a single exit point does provide an advantage in debugging, because it allows you to set a single breakpoint at the end of a function to see what value is actually going to be returned.
In general I try to have only a single exit point from a function. There are times, however, that doing so actually ends up creating a more complex function body than is necessary, in which case it's better to have multiple exit points. It really has to be a "judgement call" based on the resulting complexity, but the goal should be as few exit points as possible without sacrificing complexity and understandability.
No, because we don't live in the 1970s any more. If your function is long enough that multiple returns are a problem, it's too long.
(Quite apart from the fact that any multi-line function in a language with exceptions will have multiple exit points anyway.)
My preference would be for single exit unless it really complicates things. I have found that in some cases, multiple exist points can mask other more significant design problems:
public void DoStuff(Foo foo)
{
if (foo == null) return;
}
On seeing this code, I would immediately ask:
Is 'foo' ever null?
If so, how many clients of 'DoStuff' ever call the function with a null 'foo'?
Depending on the answers to these questions it might be that
the check is pointless as it never is true (ie. it should be an assertion)
the check is very rarely true and so it may be better to change those specific caller functions as they should probably take some other action anyway.
In both of the above cases the code can probably be reworked with an assertion to ensure that 'foo' is never null and the relevant callers changed.
There are two other reasons (specific I think to C++ code) where multiple exists can actually have a negative affect. They are code size, and compiler optimizations.
A non-POD C++ object in scope at the exit of a function will have its destructor called. Where there are several return statements, it may be the case that there are different objects in scope and so the list of destructors to call will be different. The compiler therefore needs to generate code for each return statement:
void foo (int i, int j) {
A a;
if (i > 0) {
B b;
return ; // Call dtor for 'b' followed by 'a'
}
if (i == j) {
C c;
B b;
return ; // Call dtor for 'b', 'c' and then 'a'
}
return 'a' // Call dtor for 'a'
}
If code size is an issue - then this may be something worth avoiding.
The other issue relates to "Named Return Value OptimiZation" (aka Copy Elision, ISO C++ '03 12.8/15). C++ allows an implementation to skip calling the copy constructor if it can:
A foo () {
A a1;
// do something
return a1;
}
void bar () {
A a2 ( foo() );
}
Just taking the code as is, the object 'a1' is constructed in 'foo' and then its copy construct will be called to construct 'a2'. However, copy elision allows the compiler to construct 'a1' in the same place on the stack as 'a2'. There is therefore no need to "copy" the object when the function returns.
Multiple exit points complicates the work of the compiler in trying to detect this, and at least for a relatively recent version of VC++ the optimization did not take place where the function body had multiple returns. See Named Return Value Optimization in Visual C++ 2005 for more details.
Having a single exit point reduces Cyclomatic Complexity and therefore, in theory, reduces the probability that you will introduce bugs into your code when you change it. Practice however, tends to suggest that a more pragmatic approach is needed. I therefore tend to aim to have a single exit point, but allow my code to have several if that is more readable.
I force myself to use only one return statement, as it will in a sense generate code smell. Let me explain:
function isCorrect($param1, $param2, $param3) {
$toret = false;
if ($param1 != $param2) {
if ($param1 == ($param3 * 2)) {
if ($param2 == ($param3 / 3)) {
$toret = true;
} else {
$error = 'Error 3';
}
} else {
$error = 'Error 2';
}
} else {
$error = 'Error 1';
}
return $toret;
}
(The conditions are arbritary...)
The more conditions, the larger the function gets, the more difficult it is to read. So if you're attuned to the code smell, you'll realise it, and want to refactor the code. Two possible solutions are:
Multiple returns
Refactoring into separate functions
Multiple Returns
function isCorrect($param1, $param2, $param3) {
if ($param1 == $param2) { $error = 'Error 1'; return false; }
if ($param1 != ($param3 * 2)) { $error = 'Error 2'; return false; }
if ($param2 != ($param3 / 3)) { $error = 'Error 3'; return false; }
return true;
}
Separate Functions
function isEqual($param1, $param2) {
return $param1 == $param2;
}
function isDouble($param1, $param2) {
return $param1 == ($param2 * 2);
}
function isThird($param1, $param2) {
return $param1 == ($param2 / 3);
}
function isCorrect($param1, $param2, $param3) {
return !isEqual($param1, $param2)
&& isDouble($param1, $param3)
&& isThird($param2, $param3);
}
Granted, it is longer and a bit messy, but in the process of refactoring the function this way, we've
created a number of reusable functions,
made the function more human readable, and
the focus of the functions is on why the values are correct.
I would say you should have as many as required, or any that make the code cleaner (such as guard clauses).
I have personally never heard/seen any "best practices" say that you should have only one return statement.
For the most part, I tend to exit a function as soon as possible based on a logic path (guard clauses are an excellent example of this).
I believe that multiple returns are usually good (in the code that I write in C#). The single-return style is a holdover from C. But you probably aren't coding in C.
There is no law requiring only one exit point for a method in all programming languages. Some people insist on the superiority of this style, and sometimes they elevate it to a "rule" or "law" but this belief is not backed up by any evidence or research.
More than one return style may be a bad habit in C code, where resources have to be explicitly de-allocated, but languages such as Java, C#, Python or JavaScript that have constructs such as automatic garbage collection and try..finally blocks (and using blocks in C#), and this argument does not apply - in these languages, it is very uncommon to need centralised manual resource deallocation.
There are cases where a single return is more readable, and cases where it isn't. See if it reduces the number of lines of code, makes the logic clearer or reduces the number of braces and indents or temporary variables.
Therefore, use as many returns as suits your artistic sensibilities, because it is a layout and readability issue, not a technical one.
I have talked about this at greater length on my blog.
There are good things to say about having a single exit-point, just as there are bad things to say about the inevitable "arrow" programming that results.
If using multiple exit points during input validation or resource allocation, I try to put all the 'error-exits' very visibly at the top of the function.
Both the Spartan Programming article of the "SSDSLPedia" and the single function exit point article of the "Portland Pattern Repository's Wiki" have some insightful arguments around this. Also, of course, there is this post to consider.
If you really want a single exit-point (in any non-exception-enabled language) for example in order to release resources in one single place, I find the careful application of goto to be good; see for example this rather contrived example (compressed to save screen real-estate):
int f(int y) {
int value = -1;
void *data = NULL;
if (y < 0)
goto clean;
if ((data = malloc(123)) == NULL)
goto clean;
/* More code */
value = 1;
clean:
free(data);
return value;
}
Personally I, in general, dislike arrow programming more than I dislike multiple exit-points, although both are useful when applied correctly. The best, of course, is to structure your program to require neither. Breaking down your function into multiple chunks usually help :)
Although when doing so, I find I end up with multiple exit points anyway as in this example, where some larger function has been broken down into several smaller functions:
int g(int y) {
value = 0;
if ((value = g0(y, value)) == -1)
return -1;
if ((value = g1(y, value)) == -1)
return -1;
return g2(y, value);
}
Depending on the project or coding guidelines, most of the boiler-plate code could be replaced by macros. As a side note, breaking it down this way makes the functions g0, g1 ,g2 very easy to test individually.
Obviously, in an OO and exception-enabled language, I wouldn't use if-statements like that (or at all, if I could get away with it with little enough effort), and the code would be much more plain. And non-arrowy. And most of the non-final returns would probably be exceptions.
In short;
Few returns are better than many returns
More than one return is better than huge arrows, and guard clauses are generally ok.
Exceptions could/should probably replace most 'guard clauses' when possible.
You know the adage - beauty is in the eyes of the beholder.
Some people swear by NetBeans and some by IntelliJ IDEA, some by Python and some by PHP.
In some shops you could lose your job if you insist on doing this:
public void hello()
{
if (....)
{
....
}
}
The question is all about visibility and maintainability.
I am addicted to using boolean algebra to reduce and simplify logic and use of state machines. However, there were past colleagues who believed my employ of "mathematical techniques" in coding is unsuitable, because it would not be visible and maintainable. And that would be a bad practice. Sorry people, the techniques I employ is very visible and maintainable to me - because when I return to the code six months later, I would understand the code clearly rather seeing a mess of proverbial spaghetti.
Hey buddy (like a former client used to say) do what you want as long as you know how to fix it when I need you to fix it.
I remember 20 years ago, a colleague of mine was fired for employing what today would be called agile development strategy. He had a meticulous incremental plan. But his manager was yelling at him "You can't incrementally release features to users! You must stick with the waterfall." His response to the manager was that incremental development would be more precise to customer's needs. He believed in developing for the customers needs, but the manager believed in coding to "customer's requirement".
We are frequently guilty for breaking data normalization, MVP and MVC boundaries. We inline instead of constructing a function. We take shortcuts.
Personally, I believe that PHP is bad practice, but what do I know. All the theoretical arguments boils down to trying fulfill one set of rules
quality = precision, maintainability
and profitability.
All other rules fade into the background. And of course this rule never fades:
Laziness is the virtue of a good
programmer.
I lean towards using guard clauses to return early and otherwise exit at the end of a method. The single entry and exit rule has historical significance and was particularly helpful when dealing with legacy code that ran to 10 A4 pages for a single C++ method with multiple returns (and many defects). More recently, accepted good practice is to keep methods small which makes multiple exits less of an impedance to understanding. In the following Kronoz example copied from above, the question is what occurs in //Rest of code...?:
void string fooBar(string s, int? i) {
if(string.IsNullOrEmpty(s) || i == null) return null;
var res = someFunction(s, i);
foreach(var r in res) {
if(!r.Passed) return null;
}
// Rest of code...
return ret;
}
I realise the example is somewhat contrived but I would be tempted to refactor the foreach loop into a LINQ statement that could then be considered a guard clause. Again, in a contrived example the intent of the code isn't apparent and someFunction() may have some other side effect or the result may be used in the // Rest of code....
if (string.IsNullOrEmpty(s) || i == null) return null;
if (someFunction(s, i).Any(r => !r.Passed)) return null;
Giving the following refactored function:
void string fooBar(string s, int? i) {
if (string.IsNullOrEmpty(s) || i == null) return null;
if (someFunction(s, i).Any(r => !r.Passed)) return null;
// Rest of code...
return ret;
}
One good reason I can think of is for code maintenance: you have a single point of exit. If you want to change the format of the result,..., it's just much simpler to implement. Also, for debugging, you can just stick a breakpoint there :)
Having said that, I once had to work in a library where the coding standards imposed 'one return statement per function', and I found it pretty tough. I write lots of numerical computations code, and there often are 'special cases', so the code ended up being quite hard to follow...
Multiple exit points are fine for small enough functions -- that is, a function that can be viewed on one screen length on its entirety. If a lengthy function likewise includes multiple exit points, it's a sign that the function can be chopped up further.
That said I avoid multiple-exit functions unless absolutely necessary. I have felt pain of bugs that are due to some stray return in some obscure line in more complex functions.
I've worked with terrible coding standards that forced a single exit path on you and the result is nearly always unstructured spaghetti if the function is anything but trivial -- you end up with lots of breaks and continues that just get in the way.
Single exit point - all other things equal - makes code significantly more readable.
But there's a catch: popular construction
resulttype res;
if if if...
return res;
is a fake, "res=" is not much better than "return". It has single return statement, but multiple points where function actually ends.
If you have function with multiple returns (or "res="s), it's often a good idea to break it into several smaller functions with single exit point.
My usual policy is to have only one return statement at the end of a function unless the complexity of the code is greatly reduced by adding more. In fact, I'm rather a fan of Eiffel, which enforces the only one return rule by having no return statement (there's just a auto-created 'result' variable to put your result in).
There certainly are cases where code can be made clearer with multiple returns than the obvious version without them would be. One could argue that more rework is needed if you have a function that is too complex to be understandable without multiple return statements, but sometimes it's good to be pragmatic about such things.
If you end up with more than a few returns there may be something wrong with your code. Otherwise I would agree that sometimes it is nice to be able to return from multiple places in a subroutine, especially when it make the code cleaner.
Perl 6: Bad Example
sub Int_to_String( Int i ){
given( i ){
when 0 { return "zero" }
when 1 { return "one" }
when 2 { return "two" }
when 3 { return "three" }
when 4 { return "four" }
...
default { return undef }
}
}
would be better written like this
Perl 6: Good Example
#Int_to_String = qw{
zero
one
two
three
four
...
}
sub Int_to_String( Int i ){
return undef if i < 0;
return undef unless i < #Int_to_String.length;
return #Int_to_String[i]
}
Note this is was just a quick example
I vote for Single return at the end as a guideline. This helps a common code clean-up handling ... For example, take a look at the following code ...
void ProcessMyFile (char *szFileName)
{
FILE *fp = NULL;
char *pbyBuffer = NULL:
do {
fp = fopen (szFileName, "r");
if (NULL == fp) {
break;
}
pbyBuffer = malloc (__SOME__SIZE___);
if (NULL == pbyBuffer) {
break;
}
/*** Do some processing with file ***/
} while (0);
if (pbyBuffer) {
free (pbyBuffer);
}
if (fp) {
fclose (fp);
}
}
This is probably an unusual perspective, but I think that anyone who believes that multiple return statements are to be favoured has never had to use a debugger on a microprocessor that supports only 4 hardware breakpoints. ;-)
While the issues of "arrow code" are completely correct, one issue that seems to go away when using multiple return statements is in the situation where you are using a debugger. You have no convenient catch-all position to put a breakpoint to guarantee that you're going to see the exit and hence the return condition.
The more return statements you have in a function, the higher complexity in that one method. If you find yourself wondering if you have too many return statements, you might want to ask yourself if you have too many lines of code in that function.
But, not, there is nothing wrong with one/many return statements. In some languages, it is a better practice (C++) than in others (C).

Are there any legitimate use-cases for "goto" in a language that supports loops and functions?

I've long been under the impression that goto should never be used if possible.
However, while perusing libavcodec (which is written in C) the other day, I was surprised to notice multiple uses of it.
Is it ever advantageous to use goto in a language that supports loops and functions? If so, why? Please provide a concrete example that clearly justifies the use of a goto.
Everybody who is anti-goto cites, directly or indirectly, Edsger Dijkstra's GoTo Considered Harmful article to substantiate their position. Too bad Dijkstra's article has virtually nothing to do with the way goto statements are used these days and thus what the article says has little to no applicability to the modern programming scene. The goto-less meme verges now on a religion, right down to its scriptures dictated from on high, its high priests and the shunning (or worse) of perceived heretics.
Let's put Dijkstra's paper into context to shed a little light on the subject.
When Dijkstra wrote his paper the popular languages of the time were unstructured procedural ones like BASIC, FORTRAN (the earlier dialects) and various assembly languages. It was quite common for people using the higher-level languages to jump all over their code base in twisted, contorted threads of execution that gave rise to the term "spaghetti code". You can see this by hopping on over to the classic Trek game written by Mike Mayfield and trying to figure out how things work. Take a few moments to look that over.
THIS is "the unbridled use of the go to statement" that Dijkstra was railing against in his paper in 1968. THIS is the environment he lived in that led him to write that paper. The ability to jump anywhere you like in your code at any point you liked was what he was criticising and demanding be stopped. Comparing that to the anaemic powers of goto in C or other such more modern languages is simply risible.
I can already hear the raised chants of the cultists as they face the heretic. "But," they will chant, "you can make code very difficult to read with goto in C." Oh yeah? You can make code very difficult to read without goto as well. Like this one:
#define _ -F<00||--F-OO--;
int F=00,OO=00;main(){F_OO();printf("%1.3f\n",4.*-F/OO/OO);}F_OO()
{
_-_-_-_
_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_
_-_-_-_
}
Not a goto in sight, so it must be easy to read, right? Or how about this one:
a[900]; b;c;d=1 ;e=1;f; g;h;O; main(k,
l)char* *l;{g= atoi(* ++l); for(k=
0;k*k< g;b=k ++>>1) ;for(h= 0;h*h<=
g;++h); --h;c=( (h+=g>h *(h+1)) -1)>>1;
while(d <=g){ ++O;for (f=0;f< O&&d<=g
;++f)a[ b<<5|c] =d++,b+= e;for( f=0;f<O
&&d<=g; ++f)a[b <<5|c]= d++,c+= e;e= -e
;}for(c =0;c<h; ++c){ for(b=0 ;b<k;++
b){if(b <k/2)a[ b<<5|c] ^=a[(k -(b+1))
<<5|c]^= a[b<<5 |c]^=a[ (k-(b+1 ))<<5|c]
;printf( a[b<<5|c ]?"%-4d" :" " ,a[b<<5
|c]);} putchar( '\n');}} /*Mike Laman*/
No goto there either. It must therefore be readable.
What's my point with these examples? It's not language features that make unreadable, unmaintainable code. It's not syntax that does it. It's bad programmers that cause this. And bad programmers, as you can see in that above item, can make any language feature unreadable and unusable. Like the for loops up there. (You can see them, right?)
Now to be fair, some language constructs are easier to abuse than others. If you're a C programmer, however, I'd peer far more closely at about 50% of the uses of #define long before I'd go on a crusade against goto!
So, for those who've bothered to read this far, there are several key points to note.
Dijkstra's paper on goto statements was written for a programming environment where goto was a lot
more potentially damaging than it is in most modern languages that aren't an assembler.
Automatically throwing away all uses of goto because of this is about as rational as saying "I tried
to have fun once but didn't like it so now I'm against it".
There are legitimate uses of the modern (anaemic) goto statements in code that cannot be adequately
replaced by other constructs.
There are, of course, illegitimate uses of the same statements.
There are, too, illegitimate uses of the modern control statements like the "godo" abomination where an always-false do loop is broken out of using break in place of a goto. These are often worse than judicious use of goto.
There are a few reasons for using the "goto" statement that I'm aware of (some have spoken to this already):
Cleanly exiting a function
Often in a function, you may allocate resources and need to exit in multiple places. Programmers can simplify their code by putting the resource cleanup code at the end of the function, and all "exit points" of the function would goto the cleanup label. This way, you don't have to write cleanup code at every "exit point" of the function.
Exiting nested loops
If you're in a nested loop and need to break out of all loops, a goto can make this much cleaner and simpler than break statements and if-checks.
Low-level performance improvements
This is only valid in perf-critical code, but goto statements execute very quickly and can give you a boost when moving through a function. This is a double-edged sword, however, because a compiler typically cannot optimize code that contains gotos.
Note that in all these examples, gotos are restricted to the scope of a single function.
Obeying best practices blindly is not a best practice. The idea of avoiding goto statements as one's primary form of flow control is to avoid producing unreadable spaghetti code. If used sparingly in the right places, they can sometimes be the simplest, clearest way of expressing an idea. Walter Bright, the creator of the Zortech C++ compiler and the D programming language, uses them frequently, but judiciously. Even with the goto statements, his code is still perfectly readable.
Bottom line: Avoiding goto for the sake of avoiding goto is pointless. What you really want to avoid is producing unreadable code. If your goto-laden code is readable, then there's nothing wrong with it.
Well, there's one thing that's always worse than goto's; strange use of other programflow operators to avoid a goto:
Examples:
// 1
try{
...
throw NoErrorException;
...
} catch (const NoErrorException& noe){
// This is the worst
}
// 2
do {
...break;
...break;
} while (false);
// 3
for(int i = 0;...) {
bool restartOuter = false;
for (int j = 0;...) {
if (...)
restartOuter = true;
if (restartOuter) {
i = -1;
}
}
etc
etc
Since goto makes reasoning about program flow hard1 (aka. “spaghetti code”), goto is generally only used to compensate for missing features: The use of goto may actually be acceptable, but only if the language doesn't offer a more structured variant to obtain the same goal. Take Doubt's example:
The rule with goto that we use is that goto is okay to for jumping forward to a single exit cleanup point in a function.
This is true – but only if the language doesn't allow structured exception handling with cleanup code (such as RAII or finally), which does the same job better (as it is specially built for doing it), or when there's a good reason not to employ structured exception handling (but you will never have this case except at a very low level).
In most other languages, the only acceptable use of goto is to exit nested loops. And even there it is almost always better to lift the outer loop into an own method and use return instead.
Other than that, goto is a sign that not enough thought has gone into the particular piece of code.
1 Modern languages which support goto implement some restrictions (e.g. goto may not jump into or out of functions) but the problem fundamentally remains the same.
Incidentally, the same is of course also true for other language features, most notably exceptions. And there are usually strict rules in place to only use these features where indicated, such as the rule not to use exceptions to control non-exceptional program flow.
In C# switch statement doest not allow fall-through. So goto is used to transfer control to a specific switch-case label or the default label.
For example:
switch(value)
{
case 0:
Console.WriteLine("In case 0");
goto case 1;
case 1:
Console.WriteLine("In case 1");
goto case 2;
case 2:
Console.WriteLine("In case 2");
goto default;
default:
Console.WriteLine("In default");
break;
}
Edit: There is one exception on "no fall-through" rule. Fall-through is allowed if a case statement has no code.
I've written more than a few lines of assembly language over the years. Ultimately, every high level language compiles down to gotos. Okay, call them "branches" or "jumps" or whatever else, but they're gotos. Can anyone write goto-less assembler?
Now sure, you can point out to a Fortran, C or BASIC programmer that to run riot with gotos is a recipe for spaghetti bolognaise. The answer however is not to avoid them, but to use them carefully.
A knife can be used to prepare food, free someone, or kill someone. Do we do without knives through fear of the latter? Similarly the goto: used carelessly it hinders, used carefully it helps.
#ifdef TONGUE_IN_CHEEK
Perl has a goto that allows you to implement poor-man's tail calls. :-P
sub factorial {
my ($n, $acc) = (#_, 1);
return $acc if $n < 1;
#_ = ($n - 1, $acc * $n);
goto &factorial;
}
#endif
Okay, so that has nothing to do with C's goto. More seriously, I agree with the other comments about using goto for cleanups, or for implementing Duff's device, or the like. It's all about using, not abusing.
(The same comment can apply to longjmp, exceptions, call/cc, and the like---they have legitimate uses, but can easily be abused. For example, throwing an exception purely to escape a deeply-nested control structure, under completely non-exceptional circumstances.)
Take a look at When To Use Goto When Programming in C:
Although the use of goto is almost always bad programming practice (surely you can find a better way of doing XYZ), there are times when it really isn't a bad choice. Some might even argue that, when it is useful, it's the best choice.
Most of what I have to say about goto really only applies to C. If you're using C++, there's no sound reason to use goto in place of exceptions. In C, however, you don't have the power of an exception handling mechanism, so if you want to separate out error handling from the rest of your program logic, and you want to avoid rewriting clean up code multiple times throughout your code, then goto can be a good choice.
What do I mean? You might have some code that looks like this:
int big_function()
{
/* do some work */
if([error])
{
/* clean up*/
return [error];
}
/* do some more work */
if([error])
{
/* clean up*/
return [error];
}
/* do some more work */
if([error])
{
/* clean up*/
return [error];
}
/* do some more work */
if([error])
{
/* clean up*/
return [error];
}
/* clean up*/
return [success];
}
This is fine until you realize that you need to change your cleanup code. Then you have to go through and make 4 changes. Now, you might decide that you can just encapsulate all of the cleanup into a single function; that's not a bad idea. But it does mean that you'll need to be careful with pointers -- if you plan to free a pointer in your cleanup function, there's no way to set it to then point to NULL unless you pass in a pointer to a pointer. In a lot of cases, you won't be using that pointer again anyway, so that may not be a major concern. On the other hand, if you add in a new pointer, file handle, or other thing that needs cleanup, then you'll need to change your cleanup function again; and then you'll need to change the arguments to that function.
By using goto, it will be
int big_function()
{
int ret_val = [success];
/* do some work */
if([error])
{
ret_val = [error];
goto end;
}
/* do some more work */
if([error])
{
ret_val = [error];
goto end;
}
/* do some more work */
if([error])
{
ret_val = [error];
goto end;
}
/* do some more work */
if([error])
{
ret_val = [error];
goto end;
}
end:
/* clean up*/
return ret_val;
}
The benefit here is that your code following end has access to everything it will need to perform cleanup, and you've managed to reduce the number of change points considerably. Another benefit is that you've gone from having multiple exit points for your function to just one; there's no chance you'll accidentally return from the function without cleaning up.
Moreover, since goto is only being used to jump to a single point, it's not as though you're creating a mass of spaghetti code jumping back and forth in an attempt to simulate function calls. Rather, goto actually helps write more structured code.
In a word, goto should always be used sparingly, and as a last resort -- but there is a time and a place for it. The question should be not "do you have to use it" but "is it the best choice" to use it.
The rule with goto that we use is that goto is okay to for jumping forward to a single exit cleanup point in a function. In really complex functions we relax that rule to allow other jump forwards. In both cases we are avoiding deeply nested if statements that often occur with error code checking, which helps readability and maintance.
The most thoughtful and thorough discussion of goto statements, their legitimate uses, and alternative constructs that can be used in place of "virtuous goto statements" but can be abused as easily as goto statements, is Donald Knuth's article "Structured Programming with goto Statements", in the December 1974 Computing Surveys (volume 6, no. 4. pp. 261 - 301).
Not surprisingly, some aspects of this 39-year old paper are dated: Orders-of-magnitude increases in processing power make some of Knuth's performance improvements unnoticeable for moderately sized problems, and new programming-language constructs have been invented since then. (For example, try-catch blocks subsume Zahn's Construct, although they are rarely used in that way.) But Knuth covers all sides of the argument, and should be required reading before anyone rehashes the issue yet again.
I find it funny that some people will go as far as to give a list of cases where goto is acceptable, saying that all other uses are unacceptable. Do you really think that you know every case where goto is the best choice for expressing an algorithm?
To illustrate, I'll give you an example that no one here has shown yet:
Today I was writing code for inserting an element in a hash table. The hash table is a cache of previous calculations which can be overwritten at will (affecting performance but not correctness).
Each bucket of the hash table has 4 slots, and I have a bunch of criteria to decide which element to overwrite when a bucket is full. Right now this means making up to three passes through a bucket, like this:
// Overwrite an element with same hash key if it exists
for (add_index=0; add_index < ELEMENTS_PER_BUCKET; add_index++)
if (slot_p[add_index].hash_key == hash_key)
goto add;
// Otherwise, find first empty element
for (add_index=0; add_index < ELEMENTS_PER_BUCKET; add_index++)
if ((slot_p[add_index].type == TT_ELEMENT_EMPTY)
goto add;
// Additional passes go here...
add:
// element is written to the hash table here
Now if I didn't use goto, what would this code look like?
Something like this:
// Overwrite an element with same hash key if it exists
for (add_index=0; add_index < ELEMENTS_PER_BUCKET; add_index++)
if (slot_p[add_index].hash_key == hash_key)
break;
if (add_index >= ELEMENTS_PER_BUCKET) {
// Otherwise, find first empty element
for (add_index=0; add_index < ELEMENTS_PER_BUCKET; add_index++)
if ((slot_p[add_index].type == TT_ELEMENT_EMPTY)
break;
if (add_index >= ELEMENTS_PER_BUCKET)
// Additional passes go here (nested further)...
}
// element is written to the hash table here
It would look worse and worse if more passes are added, while the version with goto keeps the same indentation level at all times and avoids the use of spurious if statements whose result is implied by the execution of the previous loop.
So there's another case where goto makes the code cleaner and easier to write and understand... I'm sure there are many more, so don't pretend to know all the cases where goto is useful, dissing any good ones that you couldn't think of.
One of the reasons goto is bad, besides coding style is that you can use it to create overlapping, but non-nested loops:
loop1:
a
loop2:
b
if(cond1) goto loop1
c
if(cond2) goto loop2
This would create the bizarre, but possibly legal flow-of-control structure where a sequence like (a, b, c, b, a, b, a, b, ...) is possible, which makes compiler hackers unhappy. Apparently there are a number of clever optimization tricks that rely on this type of structure not occuring. (I should check my copy of the dragon book...) The result of this might (using some compilers) be that other optimizations aren't done for code that contains gotos.
It might be useful if you know it just, "oh, by the way", happens to persuade the compiler to emit faster code. Personally, I'd prefer to try to explain to the compiler about what's probable and what's not before using a trick like goto, but arguably, I might also try goto before hacking assembler.
Some say there is no reason for goto in C++. Some say that in 99% cases there are better alternatives. This is not reasoning, just irrational impressions. Here's a solid example where goto leads to a nice code, something like enhanced do-while loop:
int i;
PROMPT_INSERT_NUMBER:
std::cout << "insert number: ";
std::cin >> i;
if(std::cin.fail()) {
std::cin.clear();
std::cin.ignore(1000,'\n');
goto PROMPT_INSERT_NUMBER;
}
std::cout << "your number is " << i;
Compare it to goto-free code:
int i;
bool loop;
do {
loop = false;
std::cout << "insert number: ";
std::cin >> i;
if(std::cin.fail()) {
std::cin.clear();
std::cin.ignore(1000,'\n');
loop = true;
}
} while(loop);
std::cout << "your number is " << i;
I see these differences:
nested {} block is needed (albeit do {...} while looks more familiar)
extra loop variable is needed, used in four places
it takes longer time to read and understand the work with the loop
the loop does not hold any data, it just controls the flow of the execution, which is less comprehensible than simple label
There is another example
void sort(int* array, int length) {
SORT:
for(int i=0; i<length-1; ++i) if(array[i]>array[i+1]) {
swap(data[i], data[i+1]);
goto SORT; // it is very easy to understand this code, right?
}
}
Now let's get rid of the "evil" goto:
void sort(int* array, int length) {
bool seemslegit;
do {
seemslegit = true;
for(int i=0; i<length-1; ++i) if(array[i]>array[i+1]) {
swap(data[i], data[i+1]);
seemslegit = false;
}
} while(!seemslegit);
}
You see it is the same type of using goto, it is well structured pattern and it is not forward goto as many promote as the only recommended way. Surely you want to avoid "smart" code like this:
void sort(int* array, int length) {
for(int i=0; i<length-1; ++i) if(array[i]>array[i+1]) {
swap(data[i], data[i+1]);
i = -1; // it works, but WTF on the first glance
}
}
The point is that goto can be easily misused, but goto itself is not to blame. Note that label has function scope in C++, so it does not pollute global scope like in pure assembly, in which overlapping loops have its place and are very common - like in the following code for 8051, where 7segment display is connected to P1. The program loops lightning segment around:
; P1 states loops
; 11111110 <-
; 11111101 |
; 11111011 |
; 11110111 |
; 11101111 |
; 11011111 |
; |_________|
init_roll_state:
MOV P1,#11111110b
ACALL delay
next_roll_state:
MOV A,P1
RL A
MOV P1,A
ACALL delay
JNB P1.5, init_roll_state
SJMP next_roll_state
There is another advantage: goto can serve as named loops, conditions and other flows:
if(valid) {
do { // while(loop)
// more than one page of code here
// so it is better to comment the meaning
// of the corresponding curly bracket
} while(loop);
} // if(valid)
Or you can use equivalent goto with indentation, so you don't need comment if you choose the label name wisely:
if(!valid) goto NOTVALID;
LOOPBACK:
// more than one page of code here
if(loop) goto LOOPBACK;
NOTVALID:;
I have come across a situation where a goto was a good solution, and I have not seen this example here or anywhere.
I had a switch case with a few cases which all needed to call the same function in the end. I had other cases which all needed to call a different function in the end.
This looked a bit like this:
switch( x ) {
case 1: case1() ; doStuffFor123() ; break ;
case 2: case2() ; doStuffFor123() ; break ;
case 3: case3() ; doStuffFor123() ; break ;
case 4: case4() ; doStuffFor456() ; break ;
case 5: case5() ; doStuffFor456() ; break ;
case 6: case6() ; doStuffFor456() ; break ;
case 7: case7() ; doStuffFor789() ; break ;
case 8: case8() ; doStuffFor789() ; break ;
case 9: case9() ; doStuffFor789() ; break ;
}
Instead of giving every case a function call, I replaced the break by a goto. The goto jumps to a label which is also inside the switch case.
switch( x ) {
case 1: case1() ; goto stuff123 ;
case 2: case2() ; goto stuff123 ;
case 3: case3() ; goto stuff123 ;
case 4: case4() ; goto stuff456 ;
case 5: case5() ; goto stuff456 ;
case 6: case6() ; goto stuff456 ;
case 7: case7() ; goto stuff789 ;
case 8: case8() ; goto stuff789 ;
case 9: case9() ; goto stuff789 ;
stuff123: doStuffFor123() ; break ;
stuff456: doStuffFor456() ; break ;
stuff789: doStuffFor789() ; break ;
}
cases 1 through 3 all must call doStuffFor123() and similarly cases 4 through 6 had to call doStuffFor456() etc.
In my opinion, gotos are perfectly fine if you use them correctly. In the end, any code is as clear as people write it. With gotos one can make spaghetti code, but that does not mean that gotos are the cause of the spaghetti code. That cause is us; programmers. I can also create spaghetti code with functions if I want to. The same goes for macros as well.
In a Perl module, you occasionally want to create subroutines or closures on the fly. The thing is, that once you have created the subroutine, how do you get to it. You could just call it, but then if the subroutine uses caller() it won't be as helpful as it could be. That is where the goto &subroutine variation can be helpful.
Here is a quick example:
sub AUTOLOAD{
my($self) = #_;
my $name = $AUTOLOAD;
$name =~ s/.*:://;
*{$name} = my($sub) = sub{
# the body of the closure
}
goto $sub;
# nothing after the goto will ever be executed.
}
You can also use this form of goto to provide a rudimentary form of tail-call optimization.
sub factorial($){
my($n,$tally) = (#_,1);
return $tally if $n <= 1;
$tally *= $n--;
#_ = ($n,$tally);
goto &factorial;
}
( In Perl 5 version 16 that would be better written as goto __SUB__; )
There is a module that will import a tail modifier and one that will import recur if you don't like using this form of goto.
use Sub::Call::Tail;
sub AUTOLOAD {
...
tail &$sub( #_ );
}
use Sub::Call::Recur;
sub factorial($){
my($n,$tally) = (#_,1);
return $tally if $n <= 1;
recur( $n-1, $tally * $n );
}
Most of the other reasons to use goto are better done with other keywords.
Like redoing a bit of code:
LABEL: ;
...
goto LABEL if $x;
{
...
redo if $x;
}
Or going to the last of a bit of code from multiple places:
goto LABEL if $x;
...
goto LABEL if $y;
...
LABEL: ;
{
last if $x;
...
last if $y
...
}
1) The most common use of goto that I know of is emulating exception handling in languages that don't offer it, namely in C. (The code given by Nuclear above is just that.) Look at the Linux source code and you'll see a bazillion gotos used that way; there were about 100,000 gotos in Linux code according to a quick survey conducted in 2013: http://blog.regehr.org/archives/894. Goto usage is even mentioned in the Linux coding style guide: https://www.kernel.org/doc/Documentation/CodingStyle. Just like object-oriented programming is emulated using structs populated with function pointers, goto has its place in C programming. So who is right: Dijkstra or Linus (and all Linux kernel coders)? It's theory vs. practice basically.
There is however the usual gotcha for not having compiler-level support and checks for common constructs/patterns: it's easier to use them wrong and introduce bugs without compile-time checks. Windows and Visual C++ but in C mode offer exception handling via SEH/VEH for this very reason: exceptions are useful even outside OOP languages, i.e. in a procedural language. But the compiler can't always save your bacon, even if it offers syntactic support for exceptions in the language. Consider as example of the latter case the famous Apple SSL "goto fail" bug, which just duplicated one goto with disastrous consequences (https://www.imperialviolet.org/2014/02/22/applebug.html):
if (something())
goto fail;
goto fail; // copypasta bug
printf("Never reached\n");
fail:
// control jumps here
You can have exactly the same bug using compiler-supported exceptions, e.g. in C++:
struct Fail {};
try {
if (something())
throw Fail();
throw Fail(); // copypasta bug
printf("Never reached\n");
}
catch (Fail&) {
// control jumps here
}
But both variants of the bug can be avoided if the compiler analyzes and warns you about unreachable code. For example compiling with Visual C++ at the /W4 warning level finds the bug in both cases. Java for instance forbids unreachable code (where it can find it!) for a pretty good reason: it's likely to be a bug in the average Joe's code. As long as the goto construct doesn't allow targets that the compiler can't easily figure out, like gotos to computed addresses(**), it's not any harder for the compiler to find unreachable code inside a function with gotos than using Dijkstra-approved code.
(**) Footnote: Gotos to computed line numbers are possible in some versions of Basic, e.g. GOTO 10*x where x is a variable. Rather confusingly, in Fortran "computed goto" refers to a construct that is equivalent to a switch statement in C. Standard C doesn't allow computed gotos in the language, but only gotos to statically/syntactically declared labels. GNU C however has an extension to get the address of a label (the unary, prefix && operator) and also allows a goto to a variable of type void*. See https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html for more on this obscure sub-topic. The rest of this post ins't concerned with that obscure GNU C feature.
Standard C (i.e. not computed) gotos are not usually the reason why unreachable code can't be found at compile time. The usual reason is logic code like the following. Given
int computation1() {
return 1;
}
int computation2() {
return computation1();
}
It's just as hard for a compiler to find unreachable code in any of the following 3 constructs:
void tough1() {
if (computation1() != computation2())
printf("Unreachable\n");
}
void tough2() {
if (computation1() == computation2())
goto out;
printf("Unreachable\n");
out:;
}
struct Out{};
void tough3() {
try {
if (computation1() == computation2())
throw Out();
printf("Unreachable\n");
}
catch (Out&) {
}
}
(Excuse my brace-related coding style, but I tried to keep the examples as compact as possible.)
Visual C++ /W4 (even with /Ox) fails to find unreachable code in any of these, and as you probably know the problem of finding unreachable code is undecidable in general. (If you don't believe me about that: https://www.cl.cam.ac.uk/teaching/2006/OptComp/slides/lecture02.pdf)
As a related issue, the C goto can be used to emulate exceptions only inside the body of a function. The standard C library offers a setjmp() and longjmp() pair of functions for emulating non-local exits/exceptions, but those have some serious drawbacks compared to what other languages offer. The Wikipedia article http://en.wikipedia.org/wiki/Setjmp.h explains fairly well this latter issue. This function pair also works on Windows (http://msdn.microsoft.com/en-us/library/yz2ez4as.aspx), but hardly anyone uses them there because SEH/VEH is superior. Even on Unix, I think setjmp and longjmp are very seldom used.
2) I think the second most common use of goto in C is implementing multi-level break or multi-level continue, which is also a fairly uncontroversial use case. Recall that Java doesn't allow goto label, but allows break label or continue label. According to http://www.oracle.com/technetwork/java/simple-142616.html, this is actually the most common use case of gotos in C (90% they say), but in my subjective experience, system code tends to use gotos for error handling more often. Perhaps in scientific code or where the OS offers exception handling (Windows) then multi-level exits are the dominant use case. They don't really give any details as to the context of their survey.
Edited to add: it turns out these two use patterns are found in the C book of Kernighan and Ritchie, around page 60 (depending on edition). Another thing of note is that both use cases involve only forward gotos. And it turns out that MISRA C 2012 edition (unlike the 2004 edition) now permits gotos, as long as they are only forward ones.
If so, why?
C has no multi-level/labelled break, and not all control flows can be easily modelled with C's iteration and decision primitives. gotos go a long way towards redressing these flaws.
Sometimes it's clearer to use a flag variable of some kind to effect a kind of pseudo-multi-level break, but it's not always superior to the goto (at least a goto allows one to easily determine where control goes to, unlike a flag variable), and sometimes you simply don't want to pay the performance price of flags/other contortions to avoid the goto.
libavcodec is a performance-sensitive piece of code. Direct expression of the control flow is probably a priority, because it'll tend to run better.
I find the do{} while(false) usage utterly revolting. It is conceivable might convince me it is necessary in some odd case, but never that it is clean sensible code.
If you must do some such loop, why not make the dependence on the flag variable explicit?
for (stepfailed=0 ; ! stepfailed ; /*empty*/)
The GOTO can be used, of course, but there is one more important thing than the code style, or if the code is or not readable that you must have in mind when you use it: the code inside may not be as robust as you think.
For instance, look at the following two code snippets:
If A <> 0 Then A = 0 EndIf
Write("Value of A:" + A)
An equivalent code with GOTO
If A == 0 Then GOTO FINAL EndIf
A = 0
FINAL:
Write("Value of A:" + A)
The first thing we think is that the result of both bits of code will be that "Value of A: 0" (we suppose an execution without parallelism, of course)
That's not correct: in the first sample, A will always be 0, but in the second sample (with the GOTO statement) A might not be 0. Why?
The reason is because from another point of the program I can insert a GOTO FINAL without controlling the value of A.
This example is very obvious, but as programs get more complicated, the difficulty of seeing those kind of things increases.
Related material can be found into the famous article from Mr. Dijkstra "A case against the GO TO statement"
It comes in handy for character-wise string processing from time to time.
Imagine something like this printf-esque example:
for cur_char, next_char in sliding_window(input_string) {
if cur_char == '%' {
if next_char == '%' {
cur_char_index += 1
goto handle_literal
}
# Some additional logic
if chars_should_be_handled_literally() {
goto handle_literal
}
# Handle the format
}
# some other control characters
else {
handle_literal:
# Complicated logic here
# Maybe it's writing to an array for some OpenGL calls later or something,
# all while modifying a bunch of local variables declared outside the loop
}
}
You could refactor that goto handle_literal to a function call, but if it's modifying several different local variables, you'd have to pass references to each unless your language supports mutable closures. You'd still have to use a continue statement (which is arguably a form of goto) after the call to get the same semantics if your logic makes an else case not work.
I have also used gotos judiciously in lexers, typically for similar cases. You don't need them most of the time, but they're nice to have for those weird cases.
In Perl, use of a label to "goto" from a loop - using a "last" statement, which is similar to break.
This allows better control over nested loops.
The traditional goto label is supported too, but I'm not sure there are too many instances where this is the only way to achieve what you want - subroutines and loops should suffice for most cases.
I use goto in the following case:
when needed to return from funcions at different places, and before return some uninitialization needs to be done:
non-goto version:
int doSomething (struct my_complicated_stuff *ctx)
{
db_conn *conn;
RSA *key;
char *temp_data;
conn = db_connect();
if (ctx->smth->needs_alloc) {
temp_data=malloc(ctx->some_size);
if (!temp_data) {
db_disconnect(conn);
return -1;
}
}
...
if (!ctx->smth->needs_to_be_processed) {
free(temp_data);
db_disconnect(conn);
return -2;
}
pthread_mutex_lock(ctx->mutex);
if (ctx->some_other_thing->error) {
pthread_mutex_unlock(ctx->mutex);
free(temp_data);
db_disconnect(conn);
return -3;
}
...
key=rsa_load_key(....);
...
if (ctx->something_else->error) {
rsa_free(key);
pthread_mutex_unlock(ctx->mutex);
free(temp_data);
db_disconnect(conn);
return -4;
}
if (ctx->something_else->additional_check) {
rsa_free(key);
pthread_mutex_unlock(ctx->mutex);
free(temp_data);
db_disconnect(conn);
return -5;
}
pthread_mutex_unlock(ctx->mutex);
free(temp_data);
db_disconnect(conn);
return 0;
}
goto version:
int doSomething_goto (struct my_complicated_stuff *ctx)
{
int ret=0;
db_conn *conn;
RSA *key;
char *temp_data;
conn = db_connect();
if (ctx->smth->needs_alloc) {
temp_data=malloc(ctx->some_size);
if (!temp_data) {
ret=-1;
goto exit_db;
}
}
...
if (!ctx->smth->needs_to_be_processed) {
ret=-2;
goto exit_freetmp;
}
pthread_mutex_lock(ctx->mutex);
if (ctx->some_other_thing->error) {
ret=-3;
goto exit;
}
...
key=rsa_load_key(....);
...
if (ctx->something_else->error) {
ret=-4;
goto exit_freekey;
}
if (ctx->something_else->additional_check) {
ret=-5;
goto exit_freekey;
}
exit_freekey:
rsa_free(key);
exit:
pthread_mutex_unlock(ctx->mutex);
exit_freetmp:
free(temp_data);
exit_db:
db_disconnect(conn);
return ret;
}
The second version makes it easier, when you need to change something in the deallocation statements (each is used once in the code), and reduces the chance to skip any of them, when adding a new branch. Moving them in a function will not help here, because the deallocation can be done at different "levels".
Use "goto" wherever it makes your code more readable or run faster. Just don't let it turn your code into spaghetti.
The problem with 'goto' and the most important argument of the 'goto-less programming' movement is, that if you use it too frequently your code, although it might behave correctly, becomes unreadable, unmaintainable, unreviewable etc. In 99.99% of the cases 'goto' leads to spaghetti code. Personally, I cannot think of any good reason as to why I would use 'goto'.
Edsger Dijkstra, a computer scientist that had major contributions on the field, was also famous for criticizing the use of GoTo.
There's a short article about his argument on Wikipedia.