Encapsulation and Exceptions

I'm designing my own OO language and was happily going along until I hit exceptions. It seems to me that exceptions break encapsulation.
For example, if class A has an object of class B, B has C, and C has X, and X throws an exception that propagates up to A, then the code in A must know not only about X but also about B and C to handle it correctly. You can tell this because if you replace C with D, A's exception handler will have to change to extract the relevant information from the call stack.
The only way I can think of around this problem is to make exceptions part of each class's API, so that they propagate back up the calling stack one caller at a time, with each caller re-interpreting the exception in its own terms.
Here's an example. Trend is a class for analyzing statistical trends and it has a method, slope, for calculating the slope of a line from two points.
method slope
    given
        Point 1st
        Point 2nd
    returns
        Number m
    except
        when infinite slope
    m gets
        ( 2nd's y - 1st's y ) / ( 2nd's x - 1st's x )
    except
        when any divide by zero
            declare infinite slope
        when overflow of ( 2nd's y - 1st's y )
            declare infinite slope
        when overflow of ( 2nd's x - 1st's x )
            instead do m gets 0
        when overflow of ( 2nd's y - 1st's y ) / ( 2nd's x - 1st's x )
            declare infinite slope
        when any underflow
            instead use 0
end of method slope
Is there a better way to do this?

Real-world exceptions fall into one of three rough categories:
System exceptions are thrown when fatal errors in the underlying runtime occur, things like out of memory, stack overflow, or perhaps security violations. These generally shouldn't be caught, and in many cases can't be caught. They basically use the exception system in order to take down the system (relatively) gracefully and report the final stack trace to the developer.
Programmatic exceptions are thrown when the code is doing something incorrect. Things like invalid arguments, calling methods on null pointers, out of bounds errors. These shouldn't be caught: instead the code that causes them to be thrown should be fixed. They use the exception system to avoid having a more or less redundant assert system that does most of the same things.
Runtime exceptions are thrown when something bad happens that cannot be programmatically prevented. They are thrown when an unexpected circumstance has occurred, but it's rare enough that you don't want to clutter up the main API with error codes, etc. These are things like file IO errors, network problems, parse errors, etc. These exceptions will usually be specific to the API that's throwing them. These are the only kind of exceptions that should typically be caught.
If you agree with the above breakdown, then encapsulation only affects the third category. For the other two, the exception isn't caught in code, so it doesn't matter. For the third kind of exception, yes, I think you generally should catch the exception and translate it into a different exception that's appropriate to the layer that's rethrowing it. This is why most exception systems in other languages support something like "InnerException", where you can attach the previous lower-level exception that led to the current one.
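For what it's worth, C++11 has this translate-and-attach pattern built in via std::throw_with_nested and std::rethrow_if_nested. A minimal sketch (ConfigError and the function names here are made up):

#include <exception>
#include <iostream>
#include <stdexcept>

struct ConfigError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

void readFile()
{
    throw std::runtime_error("disk read failed");        // low-level failure
}

void loadConfig()
{
    try {
        readFile();
    } catch (...) {
        // Rethrow in this layer's own terms, keeping the original attached.
        std::throw_with_nested(ConfigError("could not load configuration"));
    }
}

void report(const std::exception& e)
{
    std::cout << e.what() << '\n';
    try {
        std::rethrow_if_nested(e);                        // unwrap the cause, if any
    } catch (const std::exception& inner) {
        report(inner);
    }
}

int main()
{
    try {
        loadConfig();
    } catch (const std::exception& e) {
        report(e);                                        // prints both layers
    }
}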

if you replace C with D, A's exception handler will have to change to extract the relevant information from the call stack.
Not the way exception handlers are usually implemented. The intermediate called classes may not even exist when A is compiled, so building that dependency into A is not only hard but in principle infeasible.

I agree that exceptions feel more structured-programming-like than object-and-class-oriented.
However, they do let you deal with errors.
For example, ...
... to extract the relevant information from the call stack.
Your example in C++-like pseudocode:
class MyException {};

class XClass {
public:
    void doSomething() { throw MyException(); }  // throw by value, catch by reference
};

class CClass {
public:
    XClass X;
};

class BClass {
public:
    CClass C;
};

class AClass {
public:
    BClass B;
};

int main()
{
    AClass A;
    // exception propagates up from here:
    A.B.C.X.doSomething();
}
"A" object doesn't have to know about "X", or viceversa, regarding exceptions. There should be an exception stack, which, all objects add or remove execution data.
"pure C" doesn't have exceptions, but, they can be emulated. You may want to search for the "setjmp" library, and see can those functions interact with the stack. Those knowledge may help you implement exceptions:
http://www.cplusplus.com/reference/clibrary/csetjmp/
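Here is a minimal sketch of that emulation in C++ (setjmp marks the "try" point and longjmp acts as the "throw"; the function names are made up, and no objects with destructors sit in the frames that get skipped, which is what keeps this well defined):

#include <csetjmp>
#include <cstdio>

static std::jmp_buf handler;

void mightFail(int divisor)
{
    if (divisor == 0)
        std::longjmp(handler, 1);    // "throw": jump back to the setjmp point
    std::printf("result: %d\n", 10 / divisor);
}

int main()
{
    if (setjmp(handler) == 0) {      // "try": setjmp returns 0 on the initial call
        mightFail(0);
    } else {                         // "catch": reached via longjmp
        std::printf("caught emulated divide-by-zero\n");
    }
}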
P.S. Off-topic suggestion: add C++-style namespaces, or Pascal-style modules, to your programming language; you won't regret it.


Are red flag methods bad?

In Squeak System Browser some messages have a red flag on left side.
I saw from the balloon message that it is because I have an interruption in the code, that is, a self halt: 'foobar' or a self error: 'foobar'.
Is this so bad? How can I avoid it? I put the error message where something particularly bad has happened and going further makes little sense, like a failure in authentication, a failure in getting data from the network, and so on.
I would like to do something like raising an exception on these particular occasions, but, if possible, I don't want to see red flags on half of my methods.
Is there a standard practice to do it ?
Halts
A #halt is something you use for debugging purposes. In most cases, you insert a #halt when you want to reach a point in the execution flow and continue from there using the debugger, stepping and inspecting the involved objects. You might also want to add a #halt to see whether a certain method gets invoked or not, so as to better understand what's actually happening when you evaluate some expression. In all these cases the #halt should be removed as soon as your debugging is finished.
As a typical example, imagine you are debugging an algorithm and you need to better understand why it fails. Then you insert a #halt:
computeDiagonal: k
| product akk diff |
product := self dotProductLimitedTo: k withRow: k.
akk := matrix atRow: k column: k.
diff := akk - product.
diff < 0.0 ifTrue: [
state := #fail.
^self halt]. "wait a minute!"
lower atRow: k column: k put: diff sqrt
Assertions
There are cases, however, where your investigation wasn't conclusive enough, or the issue you are analyzing is not reproducible. It would then be a good idea to leave some longer-term indication that something should not be happening, or, if it does, to offer an opportunity to better understand its cause. In these situations a halt could work but may not be expressive enough (you are no longer immersed in the original problem), so you might want to consider an #assert: or #deny: instead. These messages, which are usually sent in unit tests, can also be present in any method and will convey a clearer intention.
Note that the decision to use #halt or #assert: doesn't depend on the method, but on the state of maturity of your model. For instance, if you aren't quite sure the algorithm won't fail again, but you cannot reproduce a failure, you should replace the #halt with an #assert: or #deny:
computeDiagonal: k
| product akk diff |
product := self dotProductLimitedTo: k withRow: k.
akk := matrix atRow: k column: k.
diff := akk - product.
self deny: diff < 0.0. "got you!"
lower atRow: k column: k put: diff sqrt
Errors
Finally, if you are pretty sure something should or shouldn't happen, #error: is your best choice. The difference between #halt, #assert:, and #error: is that the latter is for end users while the others are for developers.
computeDiagonal: k
| product akk diff |
product := self dotProductLimitedTo: k withRow: k.
akk := matrix atRow: k column: k.
diff := akk - product.
diff < 0.0 ifTrue: [self error: 'Cholesky decomposition failed']. "Oh oh..."
lower atRow: k column: k put: diff sqrt
Of course, to take full advantage of the Exception framework, you might want to consider adding your own version of the #error: message, so that it would signal a specific subclass of Exception, rather than the generic one. There are plenty of examples in the system for you to get inspiration. This is not always necessary (or good), it is just something to think about.
Note also that an Error may be resumable, so do not associate them with aborting strategies. In fact, #halt and #assert: do signal resumable exceptions.
Conclusion
The debugger is your best friend, and the #halt message will bring it up anywhere in your code. However, leaving a #halt in code that has been published will be interpreted as an indication of unfinished work.
Assertions may help other developers to better understand how to use your objects. But please, resist the temptation of being too assertive.
Errors are an elegant way of declaring unexpected behavior in a way that would allow the developer (you) to have a clue on what might have happened. Don't think of errors as text messages, errors in Smalltalk are first class objects that may contain valuable information.
As you have figured out, the red flag means there is some kind of halt in the method. This is fine in development code where you need to halt the execution to check the state.
That being said, such code does not belong in production code. It should be replaced by exceptions.
Squeak provides the following ANSI-compatible exception-handling messages:
Evaluating Blocks with Exceptions
Methods for handling exceptions raised in a BlockContext:
Message: ensure: aTerminationBlock
Description: Evaluate aTerminationBlock after evaluating the receiver, regardless of whether the receiver's evaluation completes.
Message: ifCurtailed: aTerminationBlock
Description: Evaluate the receiver. If it terminates abnormally, evaluate aTerminationBlock.
Message: on: exception do: handlerActionBlock
Description: Evaluate the receiver in the scope of an exception handler, handlerActionBlock.
Examples
["target code, which may abort"]
ensure:
["code that will always be executed
after the target code,
whatever may happen"]
["target code, which may abort"]
ifCurtailed:
["code that will be executed
whenever the target code terminates
without a normal return"]
["target code, which may abort"]
on: Exception
do: [:exception |
"code that will be executed whenever
the identified Exception is signaled."]
The source of the information is Squeak Smalltalk: Classes Reference.

Name for Effect-like Systems that only annotate function types

Some languages have ways of tracking properties like purity or the presence/absence of exceptions at compile time. The fact that they do this sort of resembles an effect system.
There seem to be two broad categories of these effect-like systems, one in which any value can be "wrapped in an effect" like the IO Monad in Haskell and one in which only functions can be annotated (like noexcept in C++ or checked exceptions in Java). constexpr in C++ is a weird case that I don't know how to think about and am intentionally ignoring, since it means very different things when applied to a function and a non-function value.
I'm wondering what you call the Haskell-IO-Monad style of effect tracking vs the checked-exception style. It seems like the only reason you would use the latter is backwards compatibility with a language that doesn't track the effect you're interested in.
More explicitly, Haskell tracks effects at the type level, via the IO Monad.
a -> b is the type of a pure function with no side effects.
Impure functions return an IO value, which has a special status within the runtime / semantics of Haskell.
The nearest Haskell equivalent to an impure function in other languages would be something like
a -> IO b or, said differently, a computation producing a b, parameterized by a. The type constructor -> still has the same meaning of pure function that it had before.
C++ has a distinction between functions that can potentially throw exceptions (the default) and functions that can't.
int add(int x, int y) noexcept {
return x + y;
}
However it isn't possible to mark a non-function as noexcept.
// BAD!
int x noexcept = <expr>;
The noexcept applies to the function itself, in effect giving you a different type constructor than ->.
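For what it's worth, since C++17 noexcept really is part of the function type, which you can see by playing with function pointers (a small sketch):

// Since C++17, noexcept participates in the function type itself.
void mayThrow() {}
void wontThrow() noexcept {}

int main()
{
    void (*p)() noexcept = wontThrow;   // OK: types match exactly
    // void (*q)() noexcept = mayThrow; // error: a potentially-throwing function
                                        // does not convert to a noexcept pointer
    void (*r)() = wontThrow;            // OK: noexcept may be dropped, not added
    (void)p; (void)r;                   // silence unused-variable warnings
}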

Is divide by zero an error or an exception?

Basically I want to know how you differentiate an error from an exception. In some programming languages accessing a non-existent file throws an error and in others it's an exception. How do you know whether something is an error or an exception?
Like anything else - you either test it or read the documentation. It can be an "Error" or an "Exception" based on the language.
Eg.
C:
Crashes and gives a divide by zero error.
Ruby:
>> 6 / 0
ZeroDivisionError: divided by 0
from (irb):1:in `/'
from (irb):1
(ZeroDivisionError is actually an exception.)
Java:
Code:
int x = 6 / 0;
Output:
Exception in thread "main" java.lang.ArithmeticException: / by zero
It depends on the language:
some languages don't have exceptions
some languages don't use exceptions for everything.
For example, in PHP:
There are exceptions
But dividing by 0 doesn't cause an exception to be thrown: it only raises a warning -- which doesn't stop the execution of the script.
The following portion of code :
echo 10 / 0;
echo "hello, world!";
Would give this result :
Warning: Division by zero in /.../temp.php on line 5
hello, world!
The terms error and exception are commonly used as jargon terms, with meanings that vary depending upon the programming ecosystem in which they are used.
Conditions
This response follows the lead of Common Lisp, and adopts the term condition as a nonjudgmental way of referring to an "interesting situation" in a program.
What makes a program condition "interesting"? Let's consider the division-by-zero case for real numbers. In the overwhelming majority of cases in which one real is divided by another, the result is another plain ordinary well-behaved real number. These are the "routine" or "uninteresting" cases. However, in the case that the divisor is zero then, mathematically speaking, the result is undefined. The program is now in an "interesting" or "exceptional" condition.
It becomes even more complicated once we take the mathematical ideal of a real number and model it, say, as an IEEE-format floating point number. If we divide 1.0 / 0.0, the IEEE standard (mostly) says that the result is in fact another floating point value, positive infinity (while 0.0 / 0.0 yields a quiet NaN). Since the result no longer behaves in the same way as a plain old real number, the program condition is once again "interesting" or "exceptional".
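You can see both results on a typical IEEE 754 platform with a couple of lines of C++ (a sketch; this is floating-point division, not integer division, which is undefined behaviour):

#include <cstdio>

int main()
{
    double one = 1.0, zero = 0.0;
    std::printf("%f\n", one / zero);    // prints inf (positive infinity)
    std::printf("%f\n", zero / zero);   // prints nan (a quiet NaN)
}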
Classifying Conditions
The question is: what should we do when we run into an interesting condition? The answer is dependent upon the context. When classifying program conditions, the following questions are useful:
How likely is it that the condition will occur: certain, probable, unlikely, impossible?
How is the condition detected: program malfunction, distinguished value, signal/handler (aka exception handling), program termination?
How should the condition be handled: ignore it, perform some special action, terminate the program?
The answers to these questions yield 4 x 4 x 3 = 48 distinct cases -- and surely more could be distinguished by further criteria. This brings us to the heart of the matter. We have more than two cases but only two labels, error and exception, to apply to them. Needless to say, there are many possible ways to divide the 48+ cases into two groups.
For example, one could say that anything involving program malfunction is an error, anything else is an exception. Or that anything involving a language's built-in exception handling facilities is an exception, anything else is an error. The possibilities are legion.
Examples
End-Of-File
When reading and processing a stream of characters, hitting the end-of-file is certain. In C, this event is detected by means of a distinguished return value from an I/O function, a so-called error return value. Thus, one speaks of an EOF error.
Division-By-Zero
When dividing two user-entered numbers in a simple calculator program, we want to give a meaningful result even if the user enters a divisor of zero. In some C environments, division-by-zero results in a signal (SIGFPE) that must be fielded by a signal handler. Signals are sometimes called exceptions in the C community and, confusingly, sometimes called program error signals. In other C environments, IEEE floating-point rules apply and the division-by-zero would result in a NaN value. The C environment would be blissfully unaware of that value, considering it to be neither an exception nor an error.
Runtime Load Failure
Programs frequently load their program code dynamically at run-time (e.g. classes, DLLs). This might fail due to a missing file. C offers no standard way to detect or recover from this case. The program would be terminated involuntarily, and one often speaks of this situation as a fatal exception. In Java, this would be termed a linkage error.
Java's Throwable Hierarchy
Java's exception-handling system divides the so-called Throwable class hierarchy into two main groups. Subclasses of Error are meant to represent conditions from which recovery is impossible. Subclasses of Exception are meant for recoverable conditions and are further subdivided into checked exceptions (for probable conditions) and unchecked exceptions (for unlikely conditions). Unfortunately, the boundaries between these categories are poorly defined and you will often find instances of throwables whose semantics suggest that they belong in a different category.
Be Wary Of Jargon
These examples show that the meanings of error and exception are murky at best. One must treat error and exception as jargon, whose meaning is determined by the context of discussion.
Of greater value are distinguishing characteristics of program conditions. What is the likelihood of the condition occurring? How is the condition detected? What action should be taken when the condition is detected? In any discussion that demands clarity, one is better suited to answer these questions directly rather than relying upon jargon terminology.
Exceptions should indicate exceptional activity, so if you reach a point in your code where you've done your best to avoid a divide by zero and it can still happen, then throwing an exception (if your language lets you) is the right way to handle it.
If it's routine logic to check for divide by zero (like for a calculator app) then you should check for that in your code before it has the chance to raise an exception. In that case, it's an error (in user input) and should be handled as such.
(Stole this idea either from The Pragmatic Programmer or Code Complete; can't remember which.)
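A small C++ sketch of that distinction (the function names are hypothetical): in the calculator, a zero divisor is routine user input, so check for it up front; deeper in the code, where it "can't happen", an exception is appropriate.

#include <optional>
#include <stdexcept>

// Routine case: a zero divisor is just invalid input, so report it as such.
std::optional<double> calculatorDivide(double a, double b)
{
    if (b == 0.0)
        return std::nullopt;
    return a / b;
}

// Exceptional case: by the time we get here, a zero divisor means a bug upstream.
double internalDivide(double a, double b)
{
    if (b == 0.0)
        throw std::domain_error("divide by zero");
    return a / b;
}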

Implementation techniques for FSM states

How do you go about implementing FSM (finite state machine) states?
I usually think of an FSM as a set of functions, a dispatcher, and a thread to indicate the current running state. That is, I make blocking calls to the functions/functors representing the states.
Just now I have implemented one in a different style, where I still represent states with function objects, but the thread just calls a state->step() method, which tries to return as quickly as possible. If the state has finished and a transition should take place, it indicates that accordingly.
I would call this the 'polling' style, since the functions mostly look like:
void step()
{
    if (!HaveReachedGoal)
    {
        doWhateverNecessary();
        return; // get out as fast as possible
    }
    // ... test perhaps some more subgoals
    indicateTransition();
}
I am aware that it is an FSM within an FSM.
It feels rather simplistic, but it has certain advantages. While a thread being blocked, or held in some kind of
while (!CanGoForward) checkGoForward();
loop, can be cumbersome and unwieldy, the polling approach felt much easier to debug. That's because the FSM object regains control after every step, so putting out some debug info is a breeze.
Well, I am deviating from my question: how do you implement the states of FSMs?
The State design pattern is an interesting way of implementing an FSM:
http://en.wikipedia.org/wiki/State_pattern
It's a very clean way of implementing an FSM, but it can get messy depending on the complexity of your FSM (though not on the number of states). However, the advantages are that:
you eliminate code duplication (especially if/else statements)
it is easier to extend with new states
your classes have better cohesion, so all related logic is in one place - this should also make your code easier to write tests for.
There is a Java and C++ implementation at this site:
http://www.vincehuston.org/dp/state.html
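A minimal C++ sketch of the pattern (the Idle/Running states are hypothetical; each state returns its successor, so the machine swaps states only after the current step() has finished):

#include <iostream>
#include <memory>

struct State {
    virtual ~State() = default;
    virtual std::unique_ptr<State> step() = 0;  // returns the next state, or nullptr to stay put
};

struct Running : State {
    std::unique_ptr<State> step() override;
};

struct Idle : State {
    std::unique_ptr<State> step() override {
        std::cout << "idle -> running\n";
        return std::make_unique<Running>();
    }
};

std::unique_ptr<State> Running::step() {
    std::cout << "running -> idle\n";
    return std::make_unique<Idle>();
}

struct Machine {
    std::unique_ptr<State> state = std::make_unique<Idle>();
    void step() {
        if (auto next = state->step())
            state = std::move(next);
    }
};

int main() {
    Machine m;
    m.step();   // idle -> running
    m.step();   // running -> idle
}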
There’s always what I call the Flying Spaghetti Monster’s style of implementing FSMs (FSM-style FSMs): using lotsa gotos. For example:
state1:
do_something();
goto state2;
state2:
if (condition) goto state1;
else goto state3;
state3:
accept;
Very nice spaghetti code :-)
I did it as a table, a flat array in memory; each cell is a state. Please have a look at the CVS source of the abandoned DFA project. For example:
class DFA {
DFA();
DFA(int mychar_groups,int mycharmap[256],int myi_state);
~DFA();
void add_trans(unsigned int from,char sym,unsigned int to);
void add_trans(unsigned int from,unsigned int symn,unsigned int to);
/*adds a transition between state from to state to*/
int add_state(bool accepting=false);
int to(int state, int symn);
int to(int state, char sym);
void set_char(char used_chars[],int);
void set_char(set<char> char_set);
vector<int > table; /*contains the table of the dfa itself*/
void normalize();
vector<unsigned int> char_map;
unsigned int char_groups; /*number of characters the DFA uses,
char_groups=0 means 1 character group is used*/
unsigned int i_state; /*initial state of the DFA*/
void switch_table_state(int first,int sec);
unsigned int num_states;
set<int > accepting_states;
};
But this was for a very specific need (matching regular expressions)
I remember my first FSM program. I wrote it in C with a very simple switch statement. Switching from one state to another or following through to the next state seemed natural.
Then I progressed to using a table lookup approach. I was able to write some very generic code using this approach. However, I was caught out a couple of times when the requirements changed and I had to support some extra events.
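A table-driven FSM can look something like this minimal sketch (the states and events are made up): the transition table maps (current state, event) to the next state, so supporting a new event means adding a column rather than rewriting the control flow.

#include <cstdio>

enum State { Idle, Running, Done, NumStates };
enum Event { Start, Finish, Reset, NumEvents };

// transitions[state][event] gives the next state.
static const State transitions[NumStates][NumEvents] = {
    /* Idle    */ { Running, Idle, Idle },
    /* Running */ { Running, Done, Idle },
    /* Done    */ { Done,    Done, Idle },
};

int main()
{
    State s = Idle;
    const Event script[] = { Start, Finish, Reset };
    for (Event e : script) {
        s = transitions[s][e];
        std::printf("state is now %d\n", s);
    }
}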
I have not written any FSMs lately. The last one I wrote was for a comms module in C++ where I used a "state design pattern" in conjunction with a "command pattern" (action).
If you are creating a complex state machine then you may want to check out SMC - the State Machine Compiler. This takes a textual representation of a state machine and compiles it into the language of your choice - it supports Java, C, C++, C#, Python, Ruby, Scala and many others.

Is there really a performance hit when catching exceptions?

I asked a question about exceptions and I am getting VERY annoyed at people saying throwing is slow. I asked in the past How exceptions work behind the scenes, and I know that in the normal code path there are no extra instructions (as the accepted answer says), but I am not entirely convinced that throwing is more expensive than checking return values. Consider the following:
{
    int ret = func();
    if (ret == 1)
        return;
    if (ret == 2)
        return;
    doSomething();
}
vs
{
    try {
        func();
        doSomething();
    }
    catch (SpecificException1 e)
    {
    }
    catch (SpecificException2 e)
    {
    }
}
As far as I know there isn't a difference, except that the ifs are moved out of the normal code path into an exception path, plus an extra jump or two to get to the exception code path. An extra jump or two doesn't sound like much when it removes a few ifs from your main (and more often run) code path. So are exceptions actually slow? Or is this a myth, or an old issue with old compilers?
(I'm talking about exceptions in general. Specifically, exceptions in compiled languages like C++ and D; though C# was also in my mind.)
Okay - I just ran a little test to make sure that exceptions are actually slower. Summary: On my machine a call w/ return is 30 cycles per iteration. A throw w/ catch is 20370 cycles per iteration.
So to answer the question - yes - throwing exceptions is slow.
Here's the test code:
#include <stdio.h>
#include <intrin.h>

int Test1()
{
    throw 1;
    // return 1;
}

int main(int argc, char* argv[])
{
    int result = 0;
    __int64 time = 0xFFFFFFFF;
    for (int i = 0; i < 10000; i++)
    {
        __int64 start = __rdtsc();
        try
        {
            result += Test1();
        }
        catch (int x)
        {
            result += x;
        }
        __int64 end = __rdtsc();
        if (time > end - start)
            time = end - start;
    }
    printf("%d\n", result);
    printf("time: %I64d\n", time);
}
Alternative try/catch written by the OP:
try
{
    if (Test1() != 0)
        result++;
}
catch (int x)
{
    result++;
}
I don't know exactly how slow it is, but throwing an exception that already exists (say it was created by the CLR) is not much slower, because you've already incurred the cost of constructing the exception. ... I believe it's the construction of an exception that creates the majority of the additional performance hit ... Think about it: it has to create a stack trace (including reading debug symbols to add line numbers and such) and potentially bundle up inner exceptions, etc.
Actually throwing an exception only adds the additional code to traverse up the stack to find the appropriate catch clause (if one exists) or to transfer control to the CLR's unhandled exception handler... This portion could be expensive for a very deep stack, but if the catch block is just at the bottom of the same method you are throwing from, for example, then it will be relatively cheap.
If you are using exceptions to actually control the flow it can be a pretty big hit.
I was digging in some old code to see why it ran so slowly. In a big loop, instead of checking for null and performing a different action, it caught the null exception and performed the alternative action.
So don't use exceptions for things they were not designed to do, because they are slower.
Use exceptions (and anything else, generally) without worrying about performance. Then, when you are finished, measure the performance with profiling tools. If it's not acceptable, you can find the bottlenecks (which probably won't be the exception handling) and optimize.
In C#, raising exceptions does have an ever so slight performance hit, but this shouldn't scare you away from using them. If you have a reason, you should throw an exception. Most people who have problems with using them cite the reason that they can disrupt the flow of a program.
Really, if your reason for not using them is a performance hit, your time can be better spent optimizing other parts of your code. I have never run into a situation where throwing an exception caused the program to behave so slowly that it had to be refactored out (well, the act of throwing the exception, not how the code treated it).
Thinking about it a little more, with all that being said, I do try to use methods which avoid throwing exceptions. If possible I'll use TryParse instead of Parse, or use KeyExists, etc. If you are doing the same operation hundreds of times over and throwing many exceptions, small amounts of inefficiency can add up.
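The same idea carries over to C++ (a small sketch): std::stoi throws on bad input, while std::from_chars reports failure through an error code instead of an exception.

#include <charconv>
#include <cstdio>
#include <string_view>

int main()
{
    std::string_view input = "not a number";
    int value = 0;
    auto [ptr, ec] = std::from_chars(input.data(), input.data() + input.size(), value);
    if (ec != std::errc{})
        std::printf("parse failed, no exception thrown\n");
    else
        std::printf("parsed %d\n", value);
    (void)ptr;   // unused here; points just past the parsed characters
}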
Yes. Exceptions make your program slower in C++. I created an 8086 CPU emulator a while back. In the code I used exceptions for CPU interrupts and faults. I made a little test case of a big complex loop that ran for about 2 minutes executing emulated opcodes. When I ran this test through a profiler, my main loop was making a significant number of calls to an "exception checker" function of gcc (actually there were two different functions related to this; my test code only threw one exception, at the end). These exception functions were called in my main loop, I believe, every time (this is where I had the try{}catch{} part). The exception functions cost me about 20% of my runtime speed (the code spent 20% of its time in there), and they were also the 3rd and 4th most called functions in the profiler...
So yes, using exceptions at all can be expensive, even without constant exception throwing.
tl;dr IMHO, avoiding exceptions for performance reasons falls into both categories: premature optimization and micro-optimization. Don't do it.
Ah, the religious war of exceptions.
The various types of answers to this are usually:
the usual mantra (a good one, IMHO): "use exceptions for exceptional situations" (IOW, not part of "normal" code paths).
If your normal user paths involved intentionally using exceptions as a control-flow mechanism, that's a smell.
tons of detail, without really answering the original question
if you really want detail:
http://blogs.msdn.com/cbrumme/archive/2003/10/01/51524.aspx
http://blogs.msdn.com/ricom/archive/2006/09/14/754661.aspx
etc.
someone pointing at microbenchmarks showing that something like i/j with j == 0 is 10x slower catching div-by-zero than checking j == 0
pragmatic answer of how to approach performance for apps in general
usually along the lines of:
make perf goals for your scenarios (ideally working with customers)
build it so it's maintainable, readable, and robust
run it and check perf of goal scenarios
if a set of scenarios aren't making goal, USE A PROFILER to tell you where your time is being spent and go from there.
IOW, any perf changes, especially micro-optimizations like this, made without profiling data driving that decision, are typically a huge waste of time.
Keep in mind that your perf wins will typically come from algorithmic changes (adding an index to a table to avoid table scans, moving something with large n from O(n^3) to O(n ln n), etc.).
More fun links:
http://en.wikipedia.org/wiki/Program_optimization
http://www.flounder.com/optimization.htm
If you want to know how exceptions work in Windows SEH, then I believe this article by Matt Pietrek is considered the definitive reference. It isn't light reading. If you want to extend this to how exceptions work in .NET, then you need to read this article by Chris Brumme, which is most definitely the definitive reference. It isn't light reading either.
The summary of Chris Brumme's article gives a detailed explanation as to why exceptions are significantly slower than using return codes. It's too long to reproduce here, and you've got a lot of reading to do before you can fully understand why.
Part of the answer is that the compiler isn't trying very hard to optimize the exceptional code path.
A catch block is a very strong hint to the compiler to aggressively optimize the non-exceptional code path at the expense of the exceptional code path. To reliably hint to a compiler which branch of an if statement is the exceptional one, you need profile-guided optimization.
The exception object must be stored somewhere, and because throwing an exception implies stack unwinding, it can't be on the stack. The compiler knows that exceptions are rare - so the optimizer isn't going to do anything that might slow down normal execution - like keeping registers or 'fast' memory of any kind available just in case it needs to put an exception in one. You may find you get a page fault. In contrast, return codes typically end up in a register (e.g. EAX).
It's like concatenating strings vs. using a StringBuilder: it's only slow if you do it a billion times.