Exception handling: do more statements in a try block lead to fallible code?

In many languages exception handling uses two or more blocks of code: a try and one or more catches.
Exception handling in the V language (v0.2.4), although it uses a different syntax, only allows a single statement in a try. One of the developers' stated motivations is that the language won't allow try blocks because they are "error prone".
I can't see any arguments that support this, so my question is: does allowing more statements in a try block lead to fallible code?
I see one fairly big drawback with allowing only one statement in a try: you may end up with more code duplication in cases where you need to attach exactly the same error handler. This can be alleviated to some degree by moving code into an error-handling function, but in some cases that is an unnecessary function declaration that you'd need to call several times in a row, and it looks quite clumsy.
On a side note, one benefit of allowing only one statement is that you don't need to declare the variable outside the try; this can be useful in some cases.
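To make the duplication point concrete, here is a rough sketch in Java-style syntax rather than V; the ConfigLoader class and its openConfig/readConfig/applyConfig steps are invented for illustration.

import java.io.IOException;

class ConfigLoader {
    // Hypothetical steps; each may throw IOException.
    void openConfig() throws IOException {}
    void readConfig() throws IOException {}
    void applyConfig() throws IOException {}

    void handle(IOException e) { System.err.println("config error: " + e); }

    // One statement per try: the identical handler is repeated three times.
    void loadOneStatementPerTry() {
        try { openConfig(); }  catch (IOException e) { handle(e); return; }
        try { readConfig(); }  catch (IOException e) { handle(e); return; }
        try { applyConfig(); } catch (IOException e) { handle(e); }
    }

    // A multi-statement try attaches the same handler once.
    void loadMultiStatementTry() {
        try {
            openConfig();
            readConfig();
            applyConfig();
        } catch (IOException e) {
            handle(e);
        }
    }
}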

Related

Why use an exception instead of if...else

For example, in the case of an "array index out of bounds" exception, why don't we check the array length in advance:
if (array.length < countNum)
{
    // logic
}
else
{
    // replace using exception
}
My question is: why choose to use an exception, and when should you use an exception instead of if-else?
Thanks.
It depends on acceptable practices for a given language.
In Java, the convention is to always check conditions whenever possible and not to use exceptions for flow control. But in Python, for example, not only is using exceptions in this manner acceptable, it is also the preferred practice.
They are used to inform the code that calls your code that an exceptional condition occurred. Exceptions are more expensive than well-formed if/else logic, so you use them in exceptional circumstances, such as reaching a condition in your code that you cannot handle locally, or to give the caller of your code the choice of how to handle the error condition.
Usually if you find yourself throwing and catching exceptions in your own function or method, you can probably find a more efficient way of doing it.
There are many answers to that question. As a single example, from Java, when you are using multiple threads, sometimes you need to interrupt a thread, and the thread will see this when an InterruptedException is thrown.
Other times, you will be using an API that throws certain exceptions. You won't be able to avoid it. If the API throws, for example, an IOException, then you can catch it, or let it bubble up.
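A rough Java sketch of both situations; the worker loop, the file path handling and the helper names are made up for illustration.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class ExceptionDrivenFlow {
    // A worker loop that reacts to interruption via InterruptedException.
    static void workUntilInterrupted() {
        try {
            while (true) {
                Thread.sleep(1000); // throws InterruptedException when the thread is interrupted
                // ... do a unit of work ...
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and stop working
        }
    }

    // An API that throws IOException: either catch it here or let it bubble up.
    static String readOrDefault(String path) {
        try {
            return new String(Files.readAllBytes(Paths.get(path)));
        } catch (IOException e) {
            return ""; // caught and handled locally; declaring "throws IOException" would let it bubble up instead
        }
    }
}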
Here's an example where it would actually be better to use an exception instead of a conditional.
Say you had a list of 10,000 strings. Now, you only want those items which are integers. Now, you know that a very small number of them won't be integers (in string form). So should you check to see if every string is an integer before trying to convert them? Or should you just try to convert them and throw and catch an exception if you get one that isn't an integer? The second way is more efficient, but if they were mostly non-integers then it would be more efficient to use an if-statement.
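As a rough illustration of the two approaches in Java (the IntegerFilter class is invented, and the regex is just one possible check-first test, ignoring overflow for brevity):

import java.util.ArrayList;
import java.util.List;

class IntegerFilter {
    // Exception-based: just try to convert, and skip the few entries that fail.
    static List<Integer> parseAll(List<String> strings) {
        List<Integer> result = new ArrayList<>();
        for (String s : strings) {
            try {
                result.add(Integer.parseInt(s));
            } catch (NumberFormatException e) {
                // rare non-integer entry: ignore it
            }
        }
        return result;
    }

    // Check-first: validate every string before converting.
    static List<Integer> parseAllChecked(List<String> strings) {
        List<Integer> result = new ArrayList<>();
        for (String s : strings) {
            if (s.matches("-?\\d+")) { // pays the check on every element, even the valid majority
                result.add(Integer.parseInt(s));
            }
        }
        return result;
    }
}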
Most of the time, however, you should not use exceptions if you can replace them with a conditional.
As someone has already said, exceptions in programming languages are for exceptional cases and not for setting the logical flow of your program. For example, in the case of the code snippet in your question, you have to look at what the enclosing method's or function's intention is. Is checking array.length < countNum part of the business logic or not? If yes, then putting a pair of if/else there is the way to go. If that condition is not part of the business logic and the enclosing method's intention is something else, then write code for that something else and throw an exception instead of going the if/else way. For example, say you develop an application for a school, and in it you have a method GetClassTopperGrades which is responsible for the business-logic requirement of returning the highest marks of a student in a certain class. The method/function definition would be something like this:
int GetClassTopperGrades(string classID)
In this case the method's intention is to return the grades for a valid class, which will always be a positive integer according to the business logic of the application. Now if someone calls your method and passes a garbage string or null, what should it do? It should throw an exception, e.g. ArgumentException or ArgumentNullException, because this is an exceptional case in this particular context. The method assumes that a valid class ID will always be passed, and NULL or an empty string is NOT a valid class ID (a deviation from the business logic).
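A rough Java rendering of that idea; the SchoolGrades class, the lookupHighestMarks helper and the use of IllegalArgumentException (a close Java analogue of ArgumentException) are assumptions for illustration.

class SchoolGrades {
    int getClassTopperGrades(String classID) {
        if (classID == null || classID.isEmpty()) {
            // Not part of the business logic: a null or empty class ID is an exceptional case here.
            throw new IllegalArgumentException("classID must be a non-empty, valid class ID");
        }
        return lookupHighestMarks(classID); // the actual business logic
    }

    private int lookupHighestMarks(String classID) {
        return 100; // stand-in for the real query against the school's records
    }
}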
Apart from that, in some conditions there is no prior knowledge about the outcome of a given piece of code and no defined way to prevent an exceptional situation. For example, when querying some remote database, if the network goes down you don't have any option other than throwing an exception. Would you check network connectivity before issuing every SQL query to the remote database?
There is a strong and indisputable reason to use exceptions, no matter the language. I strongly believe that the decision about whether to use exceptions has nothing to do with the particular language used.
Using exceptions is a universal way to notify another part of the code that something went wrong, in a loosely coupled fashion. Imagine handling some exceptional condition with if..else instead: you would need to thread arbitrary variables and other plumbing through different parts of your code, which would quickly lead to spaghetti code.
Next imagine that you are using some external library/package whose author decided on some other arbitrary way of handling bad states; it would force you to adjust to that way of dealing with them, for example checking whether particular methods return true or false, or whatever. Using exceptions makes handling errors much easier: you just assume that if something goes wrong, the other code will throw an exception, so you wrap the code in a try block and handle the possible exception in your own way.
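As a rough Java sketch of that contrast, with the LegacyClient and ModernClient interfaces invented to stand in for a third-party library:

class ImportJob {
    // Status-code style: every call site must remember to test the return value.
    boolean runWithStatusCodes(LegacyClient client) {
        if (!client.connect())   return false;
        if (!client.download())  return false;
        if (!client.saveLocal()) return false;
        return true;
    }

    // Exception style: the happy path reads straight through, and a failure
    // anywhere inside the block lands in one handler.
    boolean runWithExceptions(ModernClient client) {
        try {
            client.connect();
            client.download();
            client.saveLocal();
            return true;
        } catch (Exception e) {
            System.err.println("import failed: " + e.getMessage());
            return false;
        }
    }
}

// Invented interfaces standing in for the external library.
interface LegacyClient { boolean connect(); boolean download(); boolean saveLocal(); }
interface ModernClient { void connect() throws Exception; void download() throws Exception; void saveLocal() throws Exception; }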

assert() vs enforce(): Which to choose?

I'm having a hard time choosing whether I should "enforce" a condition or "assert" a condition in D. (This is language-neutral, though.)
Theoretically, I know that you use assertions to find bugs, and you enforce other conditions in order to check for atypical conditions. E.g. you might say assert(count >= 0) for an argument to your method, because that indicates that there's a bug in the caller, and you would say enforce(isNetworkConnected), because that's not a bug, it's just something you're assuming that could very well not be true in a legitimate situation beyond your control.
Furthermore, assertions can be removed from code as an optimization, with no side effects, but enforcements cannot be removed because they must always execute their condition code. Hence if I'm implementing a lazy-filled container that fills itself on the first access to any of its methods, I say enforce(!empty()) instead of assert(!empty()), because the check for empty() must always occur, since it lazily executes code inside.
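For readers more at home in Java than D, here is a rough analogue of that distinction using the question's own count and isNetworkConnected examples (the Transfer class is invented). Java's assert only runs when assertions are enabled with -ea, so it behaves like a removable check; an explicit test-and-throw always runs, like enforce.

class Transfer {
    // assert-like: a negative count is a bug in the caller, and the check may be compiled out.
    void sendBytes(byte[] data, int count) {
        assert count >= 0 : "caller passed a negative count";
        // ... send count bytes of data ...
    }

    // enforce-like: a missing network connection is not a bug, so the check must
    // always run and report the condition with an exception.
    void upload(byte[] data) {
        if (!isNetworkConnected()) {
            throw new IllegalStateException("network is not connected");
        }
        sendBytes(data, data.length);
    }

    private boolean isNetworkConnected() {
        return true; // stand-in for a real connectivity check
    }
}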
So I think I know what they're supposed to mean. But theory is easier than practice, and I'm having a hard time actually applying the concepts.
Consider the following:
I'm making a range (similar to an iterator) that iterates over two other ranges, and adds the results. (For functional programmers: I'm aware that I can use map!("a + b") instead, but I'm ignoring that for now, since it doesn't illustrate the question.) So I have code that looks like this in pseudocode:
Range add(Range range1, Range range2)
{
    Range result;
    while (!range1.empty)
    {
        assert(!range2.empty); // Should this be an assertion or enforcement?
        result += range1.front + range2.front;
        range1.popFront();
        range2.popFront();
    }
    return result;
}
Should that be an assertion or an enforcement? (Is it the caller's fault that the ranges don't empty at the same time? It might not have control of where the range came from -- it could've come from a user -- but then again, it still looks like a bug, doesn't it?)
Or here's another pseudocode example:
uint getFileSize(string path)
{
    HANDLE hFile = CreateFile(path, ...);
    assert(hFile != INVALID_HANDLE_VALUE); // Assertion or enforcement?
    return GetFileSize(hFile); // and close the handle, obviously
}
...
Should this be an assertion or an enforcement? The path might come from a user -- so it might not be a bug -- but it's still a precondition of this method that the path should be valid. Do I assert or enforce?
Thanks!
I'm not sure it is entirely language-neutral. No language that I use has enforce(), and if I encountered one that did then I would want to use assert and enforce in the ways they were intended, which might be idiomatic to that language.
For instance, assert in C or C++ stops the program when it fails; it doesn't throw an exception, so its usage may not be the same as what you're talking about. You don't use assert in C++ unless you think that either the caller has already made an error so grave that they can't be relied on to clean up (e.g. passing in a negative count), or else some other code elsewhere has made an error so grave that the program should be considered to be in an undefined state (e.g. your data structure appears corrupt). C++ does distinguish between runtime errors and logic errors, though, which may roughly correspond, but I think the distinction is mostly about avoidable vs. unavoidable errors.
In the case of add you'd use a logic error if the author's intent is that a program which provides mismatched lists has bugs and needs fixing, or a runtime exception if it's just one of those things that might happen. For instance, if your function were to handle arbitrary generators that don't necessarily have a means of reporting their length short of destructively evaluating the whole sequence, you'd be more likely to consider it an unavoidable error condition.
Calling it a logic error implies that it's the caller's responsibility to check the length before calling add, if they can't ensure it by the exercise of pure reason. So they would not be passing in a list from a user without explicitly checking the length first, and in all honesty should count themselves lucky they even got an exception rather than undefined behavior.
Calling it a runtime error expresses that it's "reasonable" (if abnormal) to pass in lists of different lengths, with the exception indicating that it happened on this occasion. Hence I think an enforcement rather than an assertion.
In the case of filesize: for the existence of a file, you should if possible treat that as a potentially recoverable failure (enforcement), not a bug (assertion). The reason is simply that there is no way for the caller to be certain that a file exists: there's always someone with more privileges who can come along and remove it, or unmount the entire filesystem, in between a check for existence and a call to filesize. It's therefore not necessarily a logical flaw in the calling code when it doesn't exist (although the end user might have shot themselves in the foot). Because of that, it's likely there will be callers who can treat it as just one of those things that happens, an unavoidable error condition. Creating a file handle could also fail due to out-of-memory, which is another unavoidable error on most systems, although not necessarily a recoverable one if, for example, over-committing is enabled.
Another example to consider is operator[] vs. at() for C++'s vector. at() throws out_of_range, a logic error, not because it's inconceivable that a caller might want to recover, or because you have to be some kind of numbskull to make the mistake of accessing an array out of range using at(), but because the error is entirely avoidable if the caller wants it to be - you can always check the size() before access if you have no other way of knowing whether your index is good or not. And so operator[] doesn't guarantee any checks at all, and in the name of efficiency an out of range access has undefined behavior.
assert should be considered a "run-time checked comment" indicating an assumption that the programmer makes at that moment. The assert is part of the function implementation. A failed assert should always be considered a bug at the point where the wrong assumption is made, so at the code location of the assert. To fix the bug, use a proper means to avoid the situation.
The proper means to avoid bad function inputs are contracts, so the example function should have an input contract that checks that range2 is at least as long as range1. The assertion inside the implementation could then still remain in place. Especially in longer, more complex implementations, such an assert may improve understandability.
An enforce is a lazy approach to throwing runtime exceptions. It is nice for quick-and-dirty code, because it is better to have a check in there than to silently ignore the possibility of a bad condition. For production code, it should be replaced by a proper mechanism that throws a more meaningful exception.
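A rough Java sketch of that suggestion applied to the add example, using List as a stand-in for the ranges; the RangeMath class and the exception message are invented.

import java.util.ArrayList;
import java.util.List;

class RangeMath {
    static List<Integer> add(List<Integer> range1, List<Integer> range2) {
        // Input contract: reject mismatched inputs up front with a meaningful exception.
        if (range2.size() < range1.size()) {
            throw new IllegalArgumentException("range2 must be at least as long as range1");
        }
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < range1.size(); i++) {
            assert i < range2.size(); // run-time checked comment documenting the assumption
            result.add(range1.get(i) + range2.get(i));
        }
        return result;
    }
}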
I believe you have partly answered your question yourself. Assertions are bound to break the flow: if an assertion fails, you do not agree to continue with anything. If you enforce something, you are making a decision to allow something to happen based on the situation; if you find that the conditions are not met, you can enforce that entry to a particular section is denied.

Programming style: should you return early if a guard condition is not satisfied?

One thing I've sometimes wondered is which is the better style out of the two shown below (if any)? Is it better to return immediately if a guard condition hasn't been satisfied, or should you only do the other stuff if the guard condition is satisfied?
For the sake of argument, please assume that the guard condition is a simple test that returns a boolean, such as checking to see if an element is in a collection, rather than something that might affect the control flow by throwing an exception. Also assume that methods/functions are short enough not to require editor scrolling.
// Style 1
public SomeType aMethod() {
    SomeType result = null;
    if (!guardCondition()) {
        return result;
    }
    doStuffToResult(result);
    doMoreStuffToResult(result);
    return result;
}

// Style 2
public SomeType aMethod() {
    SomeType result = null;
    if (guardCondition()) {
        doStuffToResult(result);
        doMoreStuffToResult(result);
    }
    return result;
}
I prefer the first style, except that I wouldn't create a variable when there is no need for it. I'd do this:
// Style 3
public SomeType aMethod() {
    if (!guardCondition()) {
        return null;
    }
    SomeType result = new SomeType();
    doStuffToResult(result);
    doMoreStuffToResult(result);
    return result;
}
Having been trained in Jackson Structured Programming in the late '80s, my ingrained philosophy was always "a function should have a single entry-point and a single exit-point"; this meant I wrote code according to Style 2.
In the last few years I have come to realise that code written in this style is often overcomplex and hard to read/maintain, and I have switched to Style 1.
Who says old dogs can't learn new tricks? ;)
Style 1 is what the Linux kernel indirectly recommends.
From https://www.kernel.org/doc/Documentation/process/coding-style.rst, chapter 1:
Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on a
80-character terminal screen. The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.
Style 2 adds levels of indentation, ergo, it is discouraged.
Personally, I like style 1 as well. Style 2 makes it harder to match up closing braces in functions that have several guard tests.
I don't know if guard is the right word here. Normally an unsatisfied guard results in an exception or assertion.
But besides this, I'd go with style 1, because it keeps the code cleaner in my opinion. You have a simple example with only one condition. But what happens with many conditions and style 2? It leads to a lot of nested ifs or huge if-conditions (with ||, &&). I think it is better to return from a method as soon as you know that you can.
But this is certainly very subjective ^^
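To illustrate the point about several conditions, here is a rough Java sketch; the Request type, its methods and process() are invented placeholders in the spirit of guardCondition() above.

class GuardDemo {
    static class SomeType {}
    static class Request {
        java.util.List<String> items = new java.util.ArrayList<>();
        boolean isAuthorized() { return true; }
        boolean isEmpty() { return items.isEmpty(); }
    }

    // Style 2 with several conditions: the nesting grows with every extra check.
    SomeType aMethodNested(Request request) {
        SomeType result = null;
        if (request != null) {
            if (request.isAuthorized()) {
                if (!request.isEmpty()) {
                    result = process(request);
                }
            }
        }
        return result;
    }

    // Style 1 with guard clauses: each requirement is handled once, up front.
    SomeType aMethodGuarded(Request request) {
        if (request == null) return null;
        if (!request.isAuthorized()) return null;
        if (request.isEmpty()) return null;
        return process(request);
    }

    SomeType process(Request request) { return new SomeType(); }
}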
Martin Fowler refers to this refactoring as :
"Replace Nested Conditional with Guard Clauses"
If/else statements also bring cyclomatic complexity, and hence harder-to-test code. In order to test all the if/else blocks you might need to supply a lot of different inputs.
Whereas if there are guard clauses, you can test them first, and deal with the real logic inside the if/else clauses in a clearer fashion.
If you dig through the .NET Framework using .NET Reflector you will see that the .NET programmers use style 1 (or maybe style 3, already mentioned by unbeli).
The reasons are already mentioned in the answers above, and maybe one other reason is to make the code more readable, concise and clear.
The place this style is used most is when checking input parameters; you always have to do this if you are writing some kind of framework/library/DLL.
First check all input parameters, then work with them.
It sometimes depends on the language and what kinds of "resources" that you are using (e.g. open file handles).
In C, Style 2 is definitely safer and more convenient because a function has to close and/or release any resources that it obtained during execution. This includes allocated memory blocks, file handles, handles to operating system resources such as threads or drawing contexts, locks on mutexes, and any number of other things. Delaying the return until the very end or otherwise restricting the number of exits from a function allows the programmer to more easily ensure that s/he properly cleans up, helping to prevent memory leaks, handle leaks, deadlock, and other problems.
In C++ using RAII-style programming, both styles are equally safe, so you can pick one that is more convenient. Personally I use Style 1 with RAII-style C++. C++ without RAII is like C, so, again, Style 2 is probably better in that case.
In languages like Java with garbage collection, the runtime helps smooth over the differences between the two styles because it cleans up after itself. However, there can be subtle issues with these languages, too, if you don't explicitly "close" some types of objects. For example, if you construct a new java.io.FileOutputStream and do not close it before returning, then the associated operating system handle will remain open until the runtime garbage collects the FileOutputStream instance that has fallen out of scope. This could mean that another process or thread that needs to open the file for writing may be unable to until the FileOutputStream instance is collected.
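A minimal Java sketch of avoiding that pitfall with try-with-resources; the WriteReport class and file name are made up.

import java.io.FileOutputStream;
import java.io.IOException;

class WriteReport {
    static void write(byte[] contents) throws IOException {
        // try-with-resources closes the stream on every exit path, early return or not,
        // so the OS handle is released immediately rather than whenever the object is collected.
        try (FileOutputStream out = new FileOutputStream("report.bin")) {
            if (contents.length == 0) {
                return; // guard-style early return: the stream is still closed
            }
            out.write(contents);
        }
    }
}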
Although it goes against the best practices I have been taught, I find it much better to reduce the nesting of if statements when I have a condition such as this. I think it is much easier to read, and although it exits in more than one place it is still very easy to debug.
I would say that style 1 became more widely used because it is the best practice if you combine it with small methods.
Style 2 looks like a better solution when you have big methods. When you have them, you usually have some common code that you want to execute no matter how you exit. But the proper solution is not to force a single exit point; it is to make the methods smaller.
For example, if you want to extract a sequence of code from a big method, and that method has two exit points, you start to have problems; it is hard to do automatically. When I have a big method written in style 1 I usually transform it into style 2, then I extract methods, and then each of those should end up as style 1 code.
So style 1 is best, but only in combination with small methods.
Style 2 is not as good, but it is a reasonable choice if you have big methods that you don't want to, or don't have time to, split.
I prefer to use method #1 myself; it is logically easier to read and also closer to what we are trying to do (if something bad happens, exit the function NOW, do not pass go, do not collect $200).
Furthermore, most of the time you would want to return a value that is not a logically possible result (e.g. -1) to indicate to the caller that the function failed to execute properly, so that they can take appropriate action. This lends itself better to method #1 as well.
I would say "It depends on..."
In situations where I have to perform a cleanup sequence with more than 2 or 3 lines before leaving a function/method I would prefer style 2 because the cleanup sequence has to be written and modified only once. That means maintainability is easier.
In all other cases I would prefer style 1.
Number 1 is typically the easy, lazy and sloppy way. Number 2 expresses the logic cleanly. What others have pointed out is that yes it can become cumbersome. This tendency though has an important benefit. Style #1 can hide that your function is probably doing too much. It doesn't visually demonstrate the complexity of what's going on very well. I.e. it prevents the code from saying to you "hey this is getting a bit too complex for this one function". It also makes it a bit easier for other developers that don't know your code to miss those returns sprinkled here and there, at first glance anyway.
So let the code speak. When you see long conditions appearing or nested if statements it is saying that maybe it would be better to break this stuff up into multiple functions or that it needs to be rewritten more elegantly.

Should functions that only output return anything?

I'm rewriting a series of PHP functions into a container class. Many of these functions do a bit of processing, but in the end just echo content to STDOUT.
My question is: should I have a return value within these functions? Is there a "best practice" as far as this is concerned?
In systems that report errors primarily through exceptions, don't return a return value if there isn't a natural one.
In systems that use return values to indicate errors, it's useful to have all functions return the error code. That way, a user can simply assume that every single function returns an error code and develop a pattern to check them that they follow everywhere. Even if the function can never fail right now, return a success code. That way if a future change makes it possible to have an error, users will already be checking errors instead of implicitly silently ignoring them (and getting really confused why the system is behaving oddly).
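A minimal sketch of that convention in Java, assuming an error-code style API rather than exceptions; the Status enum and the printer methods are invented.

enum Status { OK, IO_ERROR, INVALID_INPUT }

class ReportPrinter {
    // Even though printing a header cannot fail today, it still returns a Status,
    // so callers are already checking it if a future change makes failure possible.
    Status printHeader(String title) {
        System.out.println("== " + title + " ==");
        return Status.OK;
    }

    Status printBody(String body) {
        if (body == null) {
            return Status.INVALID_INPUT;
        }
        System.out.println(body);
        return Status.OK;
    }
}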
Can the processing fail? If so, should the caller know about that? If either of these is no, then I don't see value in a return. However, if the processing can fail, and that can make a difference to the caller, then I'd suggest returning a status or error code.
Do not return a value if there is no value to return. If you have some value you need to convey to the caller, then return it but that doesn't sound like the case in this instance.
I will often "return: true;" in these cases, as it provides a way to check that the function worked. Not sure about best practice though.
Note that in C/C++, the output functions (including printf()) return the number of characters written, or a negative value if this fails. It may be worth investigating this further to see why it's been done like this. I confess that:
* I'm not sure that writing to stdout could practically fail (unless you actively close your STDOUT stream)
* I've never seen anyone collect this value, let alone do anything with it.
Note that this is distinct from writing to file streams - I'm not counting stream redirection in the shell.
To do the "correct" thing, if the point of the method is only to print the data, then it shouldn't return anything.
In practice, I often find that having such functions return the text they've just printed can be useful (sometimes you also want to send an error message via email or feed it to some other function).
In the end, the choice is yours. I'd say it depends on how much of a "purist" you are about such things.
You should just:
return;
In my opinion the SRP (single responsibility principle) is applicable to methods/functions as well, not only to objects. One method should do one thing: if it outputs data it shouldn't do any data processing, and if it doesn't do processing it shouldn't return data.
There is no need to return anything, or indeed to have a return statement. It's effectively a void function, and it's comprehensible enough that these have no return value. Putting in a 'return;' solely to have a return statement is noise for the sake of pedantry.

Should you check for wrong parameter values in the constructor?

Do you check for data validity in every constructor, or do you just assume the data is correct and throw exceptions in the specific function that has a problem with the parameter?
A constructor is a function too - why differentiate?
Creating an object implies that all the integrity checks have been done. It's perfectly reasonable to check parameters in a constructor and throw an exception once an illegal value has been detected.
Among other things, this simplifies debugging. When your program throws an exception in a constructor you can look at the stack trace and often immediately see the cause. If you delay the check, you'll have to do more investigation to work out which earlier event caused the current error.
It's always better to have a fully constructed object with all its invariants satisfied from the very beginning. Beware, however, of throwing exceptions from a constructor in non-managed languages, since that may result in a memory leak.
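A minimal Java sketch of constructor-time validation; the Student class and its rules are invented for illustration.

class Student {
    private final String name;
    private final int grade;

    Student(String name, int grade) {
        // Fail at construction time, so the stack trace points straight at the bad caller.
        this.name = java.util.Objects.requireNonNull(name, "name must not be null");
        if (grade < 0 || grade > 100) {
            throw new IllegalArgumentException("grade must be between 0 and 100: " + grade);
        }
        this.grade = grade;
    }
}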
I always coerce values in constructors. If the users can't be bothered to follow the rules, I just silently enforce them without telling them.
So, if they pass a value of 107%, I'll set it to 100%. I just make it clear in the documentation that that's what happens.
Only if there's no obvious logical coercion do I throw an exception back to them. I like to call this the "principle of most astonishment to those too lazy or stupid to read the documentation".
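A rough Java sketch of that coercion approach; the Discount class and its 0-100 range are invented for illustration.

class Discount {
    private final int percent;

    Discount(int percent) {
        // Silently coerce out-of-range input instead of throwing; the behaviour is documented.
        this.percent = Math.max(0, Math.min(100, percent)); // 107 becomes 100, -5 becomes 0
    }

    int percent() { return percent; }
}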
If you throw in the constructor, the stack trace is more likely to show where the wrong values are coming from.
I agree with sharptooth in the general case, but there are sometimes objects that can have states in which some functions are valid and some are not. In these situations it's better to defer checks to the functions with these particular dependencies.
Usually you'll want to validate in a constructor, if only because that means your objects will always be in a valid state. But some kinds of objects need function-specific checks, and that's OK too.
This is a difficult question. I have changed my habit related to this question several times in the last years, so here are some thoughts coming to my mind :
Checking the arguments in the constructor is like checking the arguments for a traditional method... so I would not differentiate here...
Checking the arguments of a method of course does help to ensure the parameters passed are correct, but it also introduces a lot more code you have to write, which doesn't always pay off in my opinion... Especially when you write short methods that delegate quite frequently to other methods, I guess you would end up with 50% of your code just doing parameter checks...
Furthermore, consider that very often when you write a unit test for your class, you don't want to have to pass in valid parameters for all methods... Quite often it makes sense to only pass the parameters that matter for the current test case, so having checks would slow you down in writing unit tests...
I think this really depends on the situation... What I ended up doing is to only write parameter checks when I know that the parameters come from an external system (like user input, databases, web services, or something like that...). If data comes from my own system, I don't write checks most of the time.
I always try to fail as early as possible. So I definitively check the parameters in the constructor.
In theory, the calling code should always ensure that preconditions are met, before calling a function. The same goes for constructors.
In practice, programmers are lazy, and to be sure it's best to still check preconditions. Asserts come in handy there.
Example. Excuse my non-curly brace syntax:
// precondition: b <> 0
function divide(a, b: double): double;
begin
  assert(b <> 0); // in case of a programming error
  result := a / b;
end;

// calling code should be:
if foo <> 0 then
  bar := divide(1, foo)
else
  // do whatever you need to do when foo equals 0
Alternatively, you can always change the preconditions. In the case of a constructor this is not really handy.
// no preconditions... still need to check the result
function divide(a, b: double; out hasResult: boolean): double;
begin
  hasResult := b <> 0;
  if not hasResult then
    Exit;
  result := a / b;
end;

// calling code:
bar := divide(1, 0, result);
if not result then
  // do whatever you need to do when the division failed
Always fail as soon as possible. A good example of this practice is exhibited by the runtime:
* If you try to use an invalid index for an array, you get an exception.
* Casting fails immediately when attempted, not at some later stage.
Imagine what a disaster code would be if a bad cast remained in a reference and every time you tried to use it you got an exception. The only sensible approach is to fail as soon as possible and try to avoid bad/invalid state as soon as possible.