Most of us know that a loop should not have a non-terminating condition. For example, this C# loop has a non-terminating condition: any even value of i. This is an obvious logic error.
void CountByTwosStartingAt(byte i) { // If i is even, it never exceeds 254
for(; i < 255; i += 2) {
Console.WriteLine(i);
}
}
Sometimes there are edge cases that are extremely unlikeley, but technically constitute non-exiting conditions (stack overflows and out-of-memory errors aside). Suppose you have a function that counts the number of sequential zeros in a stream:
int CountZeros(Stream s) {
int total = 0;
while(s.ReadByte() == 0) total++;
return total;
}
Now, suppose you feed it this thing:
class InfiniteEmptyStream:Stream
{
// ... Other members ...
public override int Read(byte[] buffer, int offset, int count) {
Array.Clear(buffer, offset, count); // Output zeros
return count; // Never returns -1 (end of stream)
}
}
Or more realistically, maybe a stream that returns data from external hardware, which in certain cases might return lots of zeros (such as a game controller sitting on your desk). Either way we have an infinite loop. This particular non-terminating condition stands out, but sometimes they don't.
A completely real-world example as in an app I'm writing. An endless stream of zeros will be deserialized into infinite "empty" objects (until the collection class or GC throws an exception because I've exceeded two billion items). But this would be a completely unexpected circumstance (considering my data source).
How important is it to have absolutely no non-terminating conditions? How much does this affect "robustness?" Does it matter if they are only "theoretically" non-terminating (is it okay if an exception represents an implicit terminating condition)? Does it matter whether the app is commercial? If it is publicly distributed? Does it matter if the problematic code is in no way accessible through a public interface/API?
Edit:
One of the primary concerns I have is unforseen logic errors that can create the non-terminating condition. If, as a rule, you ensure there are no non-terminating conditions, you can identify or handle these logic errors more gracefully, but is it worth it? And when? This is a concern orthogonal to trust.
You either "trust" your data source, or you don't.
If you trust it, then probably you want to make a best effort to process the data, no matter what it is. If it sends you zeros for ever, then it has posed you a problem too big for your resources to solve, and you expend all your resources on it and fail. You say this is "completely unexpected", so the question is whether it's OK for it to merely be "completely unexpected" for your application to fall over because it's out of memory. Or does it need to actually be impossible?
If you don't trust your data source, then you might want to put an artificial limit on the size of problem you will attempt, in order to fail before your system runs out of memory.
In either case it might be possible to write your app in such a way that you recover gracefully from an out-of-memory exception.
Either way it's a robustness issue, but falling over because the problem is too big to solve (your task is impossible) is usually considered more acceptable than falling over because some malicious user is sending you a stream of zeros (you accepted an impossible task from some script-kiddie DoS attacker).
Things like that have to decided on a case-by-case basis. If may make sense to have additional sanity checks, but it is too much work too make every piece of code completely foolproof; and it is not always possible to anticipate what fools come up with.
You either "trust" your data source, or you don't.
I'd say that you either "support" the software being used with that data source, or you don't. For example I've seen software which doesn't handle an insufficient-memory condition: but insufficient memory isn't "supported" for that software (or less specifically it isn't supported for that system); so, for that system, if an insufficient-memory condition occurs, the fix is to reduce the load on the system or to increase the memory (not to fix the software). For that system, handling insufficient memory isn't a requirement: what is a requirements is to manage the load put on the system, and to provide sufficient memory for that given load.
How important is it to have absolutely
no non-terminating conditions?
It isn't important at all. That is, it's not a goal by itself. The important thing is that the code correctly implements the spec. For example, an interactive shell may have a bug if the main loop does terminate.
In the scenario you're describing, the problem of infinite zeros is actually a special case of memory exhaustion. It's not a theoretical question but something that can actually happen. You should decide how to handle this.
Related
I asked a question about exceptions and I am getting VERY annoyed at people saying throwing is slow. I asked in the past How exceptions work behind the scenes and I know in the normal code path there are no extra instructions (as the accepted answer says) but I am not entirely convinced throwing is more expensive then checking return values. Consider the following:
{
int ret = func();
if (ret == 1)
return;
if (ret == 2)
return;
doSomething();
}
vs
{
try{
func();
doSomething();
}
catch (SpecificException1 e)
{
}
catch (SpecificException2 e)
{
}
}
As far as I know there isn't a difference except the ifs are moved out of the normal code path into an exception path and an extra jump or two to get to the exception code path. An extra jump or two doesn't sound like much when it reduces a few ifs in your main and more often run) code path. So are exceptions actually slow? Or is this a myth or an old issue with old compilers?
(I'm talking about exceptions in general. Specifically, exceptions in compiled languages like C++ and D; though C# was also in my mind.)
Okay - I just ran a little test to make sure that exceptions are actually slower. Summary: On my machine a call w/ return is 30 cycles per iteration. A throw w/ catch is 20370 cycles per iteration.
So to answer the question - yes - throwing exceptions is slow.
Here's the test code:
#include <stdio.h>
#include <intrin.h>
int Test1()
{
throw 1;
// return 1;
}
int main(int argc, char*argv[])
{
int result = 0;
__int64 time = 0xFFFFFFFF;
for(int i=0; i<10000; i++)
{
__int64 start = __rdtsc();
try
{
result += Test1();
}
catch(int x)
{
result += x;
}
__int64 end = __rdtsc();
if(time > end - start)
time = end - start;
}
printf("%d\n", result);
printf("time: %I64d\n", time);
}
alternative try/catch written by op
try
{
if(Test1()!=0)
result++;
}
catch(int x)
{
result++;
I don't know exactly how slow it is, but throwing an exception that already exists (say it was created by the CLR) is not much slower, cause you've already incurred the hit of constructing the exception. ... I believe it's the construction of an exception that creates the majority of the addtional performance hit ... Think about it, it has to create a stack trace, (including reading debug symbols to add lines numbers and stuff) and potentially bundle up inner exceptions, etc.
actually throwing an exception only adds the additional code to traverse up the stack to find the appropriate catch clause (if one exists) or transfer control to the CLRs unhandled Exception handler... This portion could be expensive for a very deep stack, but if the catch block is just at the bottom of the same method you are throwing it in, for example, then it will be relatively cheap.
If you are using exceptions to actually control the flow it can be a pretty big hit.
I was digging in some old code to see why it ran so slow. In a big loop instead of checking for null and performing a different action it caught the null exception and performed the alternative action.
So don't use exceptions for things they where not designed to do because they are slower.
Use exceptions and generally anything without worrying about performance. Then, when you are finished, measure the performance with profiling tools. If it's not acceptable, you can find the bottlenecks (which probably won't be the exception handling) and optimize.
In C# raising exceptions do have an every so slight performance hit, but this shouldn't scare you away from using them. If you have a reason, you should throw an exception. Most people who have problems with using them cite the reason being because they can disrupt the flow of a program.
Really if your reasons for not using them is a performance hit, your time can be better spent optimizing other parts of your code. I have never run into a situation where throwing an exception caused the program to behave so slowly that it had to be re-factored out (well the act of throwing the exception, not how the code treated it).
Thinking about it a little more, with all that being said, I do try and use methods which avoid throwing exceptions. If possible I'll use TryParse instead of Parse, or use KeyExists etc. If you are doing the same operation 100s of times over and throwing many exception small amounts of inefficiency can add up.
Yes. Exceptions make your program slower in C++. I created an 8086 CPU Emulator a while back. In the code I used exceptions for CPU Interrupts and Faults. I made a little test case of a big complex loop that ran for about 2 minutes doing emulated opcodes. When I ran this test through a profiler, my main loop was making a significant amount of calls to an "exception checker" function of gcc(actually there were two different functions related to this. My test code only threw one exception at the end however.) These exception functions were called in my main loop I believe every time(this is where I had the try{}catch{} part.). The exception functions cost me about 20% of my runtime speed.(the code spent 20% of it's time in there). And the exception functions were also the 3rd and 4th most called functions in the profiler...
So yes, using exceptions at all can be expensive, even without constant exception throwing.
tl;dr IMHO, Avoiding exceptions for performance reasons hits both categories of premature and micro- optimizations. Don't do it.
Ah, the religious war of exceptions.
The various types of answers to this are usually:
the usual mantra (a good one, IMHO): "use exceptions for exceptional situations" (IOW, not part of "normal" code paths).
If your normal user paths involved intentionally using exceptions as a control-flow mechanism, that's a smell.
tons of detail, without really answering the original question
if you really want detail:
http://blogs.msdn.com/cbrumme/archive/2003/10/01/51524.aspx
http://blogs.msdn.com/ricom/archive/2006/09/14/754661.aspx
etc.
someone pointing at microbenchmarks showing that something like i/j with j == 0 is 10x slower catching div-by-zero than checking j == 0
pragmatic answer of how to approach performance for apps in general
usually along the lines of:
make perf goals for your scenarios (ideally working with customers)
build it so it's maintainable, readable, and robust
run it and check perf of goal scenarios
if a set of scenarios aren't making goal, USE A PROFILER to tell you where your time is being spent and go from there.
IOW, any perf changes, especially micro-optimizations like this, made without profiling data driving that decision, is typically a huge waste of time.
Keep in mind that your perf wins will typically come from algorithmic changes (adding an index to a table to avoid table scans, moving something with large n from O(n^3) to O(n ln n), etc.).
More fun links:
http://en.wikipedia.org/wiki/Program_optimization
http://www.flounder.com/optimization.htm
If you want to know how exceptions work in Windows SEH, then I believe this article by Matt Pietrik is considered the definitive reference. It isn't light reading. If you want to extend this to how exceptions work in .NET, then you need to read this article by Chris Brumme, which is most definitely the definitive reference. It isn't light reading either.
The summary of Chris Brumme's article gives a detailed explanation as to why exception are significantly slower than using return codes. It's too long to reproduce here, and you've got a lot of reading to do before you can fully understand why.
Part of the answer is that the compiler isn't trying very hard to optimize the exceptional code path.
A catch block is a very strong hint to the compiler to agressively optimize the non-exceptional code path at the expense of the exceptional code path. To reliably hint to a compiler which branch of an if statement is the exceptional one you need profile guided optimization.
The exception object must be stored somewhere, and because throwing an exception implies stack unwinding, it can't be on the stack. The compiler knows that exceptions are rare - so the optimizer isn't going to do anything that might slow down normal execution - like keeping registers or 'fast' memory of any kind available just in case it needs to put an exception in one. You may find you get a page fault. In contrast, return codes typically end up in a register (e.g. EAX).
it's like concating strings vs stringbuilder. it's only slow if you do it a billion times.
It goes without saying that using hard-coded, hex literal pointers is a disaster:
int *i = 0xDEADBEEF;
// god knows if that location is available
However, what exactly is the danger in using hex literals as variable values?
int i = 0xDEADBEEF;
// what can go wrong?
If these values are indeed "dangerous" due to their use in various debugging scenarios, then this means that even if I do not use these literals, any program that during runtime happens to stumble upon one of these values might crash.
Anyone care to explain the real dangers of using hex literals?
Edit: just to clarify, I am not referring to the general use of constants in source code. I am specifically talking about debug-scenario issues that might come up to the use of hex values, with the specific example of 0xDEADBEEF.
There's no more danger in using a hex literal than any other kind of literal.
If your debugging session ends up executing data as code without you intending it to, you're in a world of pain anyway.
Of course, there's the normal "magic value" vs "well-named constant" code smell/cleanliness issue, but that's not really the sort of danger I think you're talking about.
With few exceptions, nothing is "constant".
We prefer to call them "slow variables" -- their value changes so slowly that we don't mind recompiling to change them.
However, we don't want to have many instances of 0x07 all through an application or a test script, where each instance has a different meaning.
We want to put a label on each constant that makes it totally unambiguous what it means.
if( x == 7 )
What does "7" mean in the above statement? Is it the same thing as
d = y / 7;
Is that the same meaning of "7"?
Test Cases are a slightly different problem. We don't need extensive, careful management of each instance of a numeric literal. Instead, we need documentation.
We can -- to an extent -- explain where "7" comes from by including a tiny bit of a hint in the code.
assertEquals( 7, someFunction(3,4), "Expected 7, see paragraph 7 of use case 7" );
A "constant" should be stated -- and named -- exactly once.
A "result" in a unit test isn't the same thing as a constant, and requires a little care in explaining where it came from.
A hex literal is no different than a decimal literal like 1. Any special significance of a value is due to the context of a particular program.
I believe the concern raised in the IP address formatting question earlier today was not related to the use of hex literals in general, but the specific use of 0xDEADBEEF. At least, that's the way I read it.
There is a concern with using 0xDEADBEEF in particular, though in my opinion it is a small one. The problem is that many debuggers and runtime systems have already co-opted this particular value as a marker value to indicate unallocated heap, bad pointers on the stack, etc.
I don't recall off the top of my head just which debugging and runtime systems use this particular value, but I have seen it used this way several times over the years. If you are debugging in one of these environments, the existence of the 0xDEADBEEF constant in your code will be indistinguishable from the values in unallocated RAM or whatever, so at best you will not have as useful RAM dumps, and at worst you will get warnings from the debugger.
Anyhow, that's what I think the original commenter meant when he told you it was bad for "use in various debugging scenarios."
There's no reason why you shouldn't assign 0xdeadbeef to a variable.
But woe betide the programmer who tries to assign decimal 3735928559, or octal 33653337357, or worst of all: binary 11011110101011011011111011101111.
Big Endian or Little Endian?
One danger is when constants are assigned to an array or structure with different sized members; the endian-ness of the compiler or machine (including JVM vs CLR) will affect the ordering of the bytes.
This issue is true of non-constant values, too, of course.
Here's an, admittedly contrived, example. What is the value of buffer[0] after the last line?
const int TEST[] = { 0x01BADA55, 0xDEADBEEF };
char buffer[BUFSZ];
memcpy( buffer, (void*)TEST, sizeof(TEST));
I don't see any problem with using it as a value. Its just a number after all.
There's no danger in using a hard-coded hex value for a pointer (like your first example) in the right context. In particular, when doing very low-level hardware development, this is the way you access memory-mapped registers. (Though it's best to give them names with a #define, for example.) But at the application level you shouldn't ever need to do an assignment like that.
I use CAFEBABE
I haven't seen it used by any debuggers before.
int *i = 0xDEADBEEF;
// god knows if that location is available
int i = 0xDEADBEEF;
// what can go wrong?
The danger that I see is the same in both cases: you've created a flag value that has no immediate context. There's nothing about i in either case that will let me know 100, 1000 or 10000 lines that there is a potentially critical flag value associated with it. What you've planted is a landmine bug that, if I don't remember to check for it in every possible use, I could be faced with a terrible debugging problem. Every use of i will now have to look like this:
if (i != 0xDEADBEEF) { // Curse the original designer to oblivion
// Actual useful work goes here
}
Repeat the above for all of the 7000 instances where you need to use i in your code.
Now, why is the above worse than this?
if (isIProperlyInitialized()) { // Which could just be a boolean
// Actual useful work goes here
}
At a minimum, I can spot several critical issues:
Spelling: I'm a terrible typist. How easily will you spot 0xDAEDBEEF in a code review? Or 0xDEADBEFF? On the other hand, I know that my compile will barf immediately on isIProperlyInitialised() (insert the obligatory s vs. z debate here).
Exposure of meaning. Rather than trying to hide your flags in the code, you've intentionally created a method that the rest of the code can see.
Opportunities for coupling. It's entirely possible that a pointer or reference is connected to a loosely defined cache. An initialization check could be overloaded to check first if the value is in cache, then to try to bring it back into cache and, if all that fails, return false.
In short, it's just as easy to write the code you really need as it is to create a mysterious magic value. The code-maintainer of the future (who quite likely will be you) will thank you.
In several modern programming languages (including C++, Java, and C#), the language allows integer overflow to occur at runtime without raising any kind of error condition.
For example, consider this (contrived) C# method, which does not account for the possibility of overflow/underflow. (For brevity, the method also doesn't handle the case where the specified list is a null reference.)
//Returns the sum of the values in the specified list.
private static int sumList(List<int> list)
{
int sum = 0;
foreach (int listItem in list)
{
sum += listItem;
}
return sum;
}
If this method is called as follows:
List<int> list = new List<int>();
list.Add(2000000000);
list.Add(2000000000);
int sum = sumList(list);
An overflow will occur in the sumList() method (because the int type in C# is a 32-bit signed integer, and the sum of the values in the list exceeds the value of the maximum 32-bit signed integer). The sum variable will have a value of -294967296 (not a value of 4000000000); this most likely is not what the (hypothetical) developer of the sumList method intended.
Obviously, there are various techniques that can be used by developers to avoid the possibility of integer overflow, such as using a type like Java's BigInteger, or the checked keyword and /checked compiler switch in C#.
However, the question that I'm interested in is why these languages were designed to by default allow integer overflows to happen in the first place, instead of, for example, raising an exception when an operation is performed at runtime that would result in an overflow. It seems like such behavior would help avoid bugs in cases where a developer neglects to account for the possibility of overflow when writing code that performs an arithmetic operation that could result in overflow. (These languages could have included something like an "unchecked" keyword that could designate a block where integer overflow is permitted to occur without an exception being raised, in those cases where that behavior is explicitly intended by the developer; C# actually does have this.)
Does the answer simply boil down to performance -- the language designers didn't want their respective languages to default to having "slow" arithmetic integer operations where the runtime would need to do extra work to check whether an overflow occurred, on every applicable arithmetic operation -- and this performance consideration outweighed the value of avoiding "silent" failures in the case that an inadvertent overflow occurs?
Are there other reasons for this language design decision as well, other than performance considerations?
In C#, it was a question of performance. Specifically, out-of-box benchmarking.
When C# was new, Microsoft was hoping a lot of C++ developers would switch to it. They knew that many C++ folks thought of C++ as being fast, especially faster than languages that "wasted" time on automatic memory management and the like.
Both potential adopters and magazine reviewers are likely to get a copy of the new C#, install it, build a trivial app that no one would ever write in the real world, run it in a tight loop, and measure how long it took. Then they'd make a decision for their company or publish an article based on that result.
The fact that their test showed C# to be slower than natively compiled C++ is the kind of thing that would turn people off C# quickly. The fact that your C# app is going to catch overflow/underflow automatically is the kind of thing that they might miss. So, it's off by default.
I think it's obvious that 99% of the time we want /checked to be on. It's an unfortunate compromise.
I think performance is a pretty good reason. If you consider every instruction in a typical program that increments an integer, and if instead of the simple op to add 1, it had to check every time if adding 1 would overflow the type, then the cost in extra cycles would be pretty severe.
You work under the assumption that integer overflow is always undesired behavior.
Sometimes integer overflow is desired behavior. One example I've seen is representation of an absolute heading value as a fixed point number. Given an unsigned int, 0 is 0 or 360 degrees and the max 32 bit unsigned integer (0xffffffff) is the biggest value just below 360 degrees.
int main()
{
uint32_t shipsHeadingInDegrees= 0;
// Rotate by a bunch of degrees
shipsHeadingInDegrees += 0x80000000; // 180 degrees
shipsHeadingInDegrees += 0x80000000; // another 180 degrees, overflows
shipsHeadingInDegrees += 0x80000000; // another 180 degrees
// Ships heading now will be 180 degrees
cout << "Ships Heading Is" << (double(shipsHeadingInDegrees) / double(0xffffffff)) * 360.0 << std::endl;
}
There are probably other situations where overflow is acceptable, similar to this example.
C/C++ never mandate trap behaviour. Even the obvious division by 0 is undefined behaviour in C++, not a specified kind of trap.
The C language doesn't have any concept of trapping, unless you count signals.
C++ has a design principle that it doesn't introduce overhead not present in C unless you ask for it. So Stroustrup would not have wanted to mandate that integers behave in a way which requires any explicit checking.
Some early compilers, and lightweight implementations for restricted hardware, don't support exceptions at all, and exceptions can often be disabled with compiler options. Mandating exceptions for language built-ins would be problematic.
Even if C++ had made integers checked, 99% of programmers in the early days would have turned if off for the performance boost...
Because checking for overflow takes time. Each primitive mathematical operation, which normally translates into a single assembly instruction would have to include a check for overflow, resulting in multiple assembly instructions, potentially resulting in a program that is several times slower.
It is likely 99% performance. On x86 would have to check the overflow flag on every operation which would be a huge performance hit.
The other 1% would cover those cases where people are doing fancy bit manipulations or being 'imprecise' in mixing signed and unsigned operations and want the overflow semantics.
Backwards compatibility is a big one. With C, it was assumed that you were paying enough attention to the size of your datatypes that if an over/underflow occurred, that that was what you wanted. Then with C++, C# and Java, very little changed with how the "built-in" data types worked.
If integer overflow is defined as immediately raising a signal, throwing an exception, or otherwise deflecting program execution, then any computations which might overflow will need to be performed in the specified sequence. Even on platforms where integer overflow checking wouldn't cost anything directly, the requirement that integer overflow be trapped at exactly the right point in a program's execution sequence would severely impede many useful optimizations.
If a language were to specify that integer overflows would instead set a latching error flag, were to limit how actions on that flag within a function could affect its value within calling code, and were to provide that the flag need not be set in circumstances where an overflow could not result in erroneous output or behavior, then compilers could generate more efficient code than any kind of manual overflow-checking programmers could use. As a simple example, if one had a function in C that would multiply two numbers and return a result, setting an error flag in case of overflow, a compiler would be required to perform the multiplication whether or not the caller would ever use the result. In a language with looser rules like I described, however, a compiler that determined that nothing ever uses the result of the multiply could infer that overflow could not affect a program's output, and skip the multiply altogether.
From a practical standpoint, most programs don't care about precisely when overflows occur, so much as they need to guarantee that they don't produce erroneous results as a consequence of overflow. Unfortunately, programming languages' integer-overflow-detection semantics have not caught up with what would be necessary to let compilers produce efficient code.
My understanding of why errors would not be raised by default at runtime boils down to the legacy of desiring to create programming languages with ACID-like behavior. Specifically, the tenet that anything that you code it to do (or don't code), it will do (or not do). If you didn't code some error handler, then the machine will "assume" by virtue of no error handler, that you really want to do the ridiculous, crash-prone thing you're telling it to do.
(ACID reference: http://en.wikipedia.org/wiki/ACID)
The word seems to get used in a number of contexts. The best I can figure is that they mean a variable that can't change. Isn't that what constants/finals (darn you Java!) are for?
An invariant is more "conceptual" than a variable. In general, it's a property of the program state that is always true. A function or method that ensures that the invariant holds is said to maintain the invariant.
For instance, a binary search tree might have the invariant that for every node, the key of the node's left child is less than the node's own key. A correctly written insertion function for this tree will maintain that invariant.
As you can tell, that's not the sort of thing you can store in a variable: it's more a statement about the program. By figuring out what sort of invariants your program should maintain, then reviewing your code to make sure that it actually maintains those invariants, you can avoid logical errors in your code.
It is a condition you know to always be true at a particular place in your logic and can check for when debugging to work out what has gone wrong.
The magic of wikipedia: Invariant (computer science)
In computer science, a predicate that,
if true, will remain true throughout a
specific sequence of operations, is
called (an) invariant to that
sequence.
This answer is for my 5 year old kid. Do not think of an invariant as a constant or fixed numerical value. But it can be. However, it is more than that.
Rather, an invariant is something like of a fixed relationship between varying entities. For example, your age will always be less than that compared to your biological parents. Both your age, and your parent's age changes in the passage of time, but the relationship that i mentioned above is an invariant.
An invariant can also be a numerical constant. For example, the value of pi is an invariant ratio between the circle's circumference over its diameter. No matter how big or small the circle is, that ratio will always be pi.
I usually view them more in terms of algorithms or structures.
For example, you could have a loop invariant that could be asserted--always true at the beginning or end of each iteration. That is, if your loop was supposed to process a collection of objects from one stack to another, you could say that |stack1|+|stack2|=c, at the top or bottom of the loop.
If the invariant check failed, it would indicate something went wrong. In this example, it could mean that you forgot to push the processed element onto the final stack, etc.
As this line states:
In computer science, a predicate that, if true, will remain true throughout a specific sequence of operations, is called (an) invariant to that sequence.
To better understand this hope this example in C++ helps.
Consider a scenario where you have to get some values and get the total count of them in a variable called as count and add them in a variable called as sum
The invariant (again it's more like a concept):
// invariant:
// we have read count grades so far, and
// sum is the sum of the first count grades
The code for the above would be something like this,
int count=0;
double sum=0,x=0;
while (cin >> x) {
++count;
sum+=x;
}
What the above code does?
1) Reads the input from cin and puts them in x
2) After one successful read, increment count and sum = sum + x
3) Repeat 1-2 until read stops ( i.e ctrl+D)
Loop invariant:
The invariant must be True ALWAYS. So initially you start out your code with just this
while(cin>>x){
}
This loop reads data from standard input and stores in x. Well and good. But the invariant becomes false because the first part of our invariant wasn't followed (or kept true).
// we have read count grades so far, and
How to keep the invariant true?
Simple! increment count.
So ++count; would do good!. Now our code becomes something like this,
while(cin>>x){
++count;
}
But
Even now our invariant (a concept which must be TRUE) is False because now we didn't satisfy the second part of our invariant.
// sum is the sum of the first count grades
So what to do now?
Add x to sum and store it in sum ( sum+=x) and the next time
cin>>x will read a new value into x.
Now our code becomes something like this,
while(cin>>x){
++count;
sum+=x;
}
Let's check
Whether code matches our invariant
// invariant:
// we have read count grades so far, and
// sum is the sum of the first count grades
code:
while(cin>>x){
++count;
sum+=x;
}
Ah!. Now the loop invariant is True always and code works fine.
The above example was taken and modified from the book Accelerated C++ by Andrew-koening and Barbara-E
Something that doesn't change within a block of code
All the answers here are great, but i felt that i can shed more light on the matter:
Invariant from a language point of view means something that never changes. The concept though comes actually from math, it's one of the popular proof techniques when combined with induction.
Here is how a proof goes, If you can find an invariant that is in the initial state, And that this invariant persists regardless of any [legal] transformation applied to the state, then you can prove that If a certain state does not have this invariant then it can never occur, no matter what sequence of transformations are applied to the initial state.
Now the previous way of thinking (again combined with induction) makes it possible to predicate the logic of computer software. Especially important when the execution goes in loops, in which an invariant can be used to prove that a certain loop will yield a certain result or that it will never change the state of a program in a certain way.
When invariant is used to predicate a loop logic its called loop invariant. It can be used outside loops, but for loops it is really important, because you often have a lot of possibilities, or an infinite number of possibilities.
Notice that i use the word "predicate" the logic of a computer software, and not prove. And that's because while in math invariant can be used as a proof, it can never prove that the computer software when executed will yield what is expected, due to the fact that the software is executed on top of many abstractions, that can never be proved that they will yield what is expected (think of the hardware abstraction for example).
Finally while theoretically and rigorously predicting software logic is only important for high critical applications like Medical, and Military ones. Invariant can still be used to aid the typical programmer when debugging. It can be used to know where at a certain location The program failed because it has failed to maintain a certain invariant - many of us use it anyway without giving a thought about it.
Class Invariant
Class Invariant is a condition which should be always true before and after calling relevant function
For example balanced tree has an Invariant which is called isBalanced. When you modify your tree through some methods (e.g. addNode, removeNode...) - isBalanced should be always true before and after modifying the tree
Following on from what it is, invariants are quite useful in writing clean code, since knowing conceptually what invariants should be present in your code allows you to easily decide how to organize your code to reach those aims. As mentioned ealier, they're also useful in debugging, as checking to see if the invariant's being maintained is often a good way of seeing if whatever manipulation you're attempting to perform is actually doing what you want it to.
It's typically a quantity that does not change under certain mathematical operations.
An example is a scalar, which does not change under rotations. In magnetic resonance imaging, for example, it is useful to characterize a tissue property by a rotational invariant, because then its estimation ideally does not depend on the orientation of the body in the scanner.
The ADT invariant specifes relationships
among the data fields (instance variables)
that must always be true before and after
the execution of any instance method.
There is an excellent example of an invariant and why it matters in the book Java Concurrency in Practice.
Although Java-centric, the example describes some code that is responsible for calculating the factors of a provided integer. The example code attempts to cache the last number provided, and the factors that were calculated to improve performance. In this scenario there is an invariant that was not accounted for in the example code which has left the code susceptible to race conditions in a concurrent scenario.
What is the single most effective practice to prevent arithmetic overflow and underflow?
Some examples that come to mind are:
testing based on valid input ranges
validation using formal methods
use of invariants
detection at runtime using language features or libraries (this does not prevent it)
One possibility is to use a language that has arbitrarily sized integers that never overflow / underflow.
Otherwise, if this is something you're really concerned about, and if your language allows it, write a wrapper class that acts like an integer, but checks every operation for overflow. You could even have it do the check on debug builds, and leave things optimized for release builds. In a language like C++, you could do this, and it would behave almost exactly like an integer for release builds, but for debug builds you'd get full run-time checking.
class CheckedInt
{
private:
int Value;
public:
// Constructor
CheckedInt(int src) : Value(src) {}
// Conversions back to int
operator int&() { return Value; }
operator const int &() const { return Value; }
// Operators
CheckedInt operator+(CheckedInt rhs) const
{
if (rhs.Value < 0 && rhs.Value + Value > Value)
throw OverflowException();
if (rhs.Value > 0 && rhs.Value + Value < Value)
throw OverflowException();
return CheckedInt(rhs.Value + Value);
}
// Lots more operators...
};
Edit:
Turns out someone is doing this already for C++ - the current implementation is focused for Visual Studio, but it looks like they're getting support for gcc as well.
I write a lot of test code to do range/validity checking on my code. This tends to catch most of these types of situations - and definitely helps me write more bulletproof code.
Use high precision floating point numbers like a long double.
I think you are missing one very important option in your list: choose the right programming language for the job. There are many programming languages which do not have these problems, because they don't have fixed size integers.
There are more important considerations when choosing which language you use than the size of the integer. Simply check your input if you don't know if the value is in bounds, or use exception handling if the case is extremely rare.
A wrapper that checks for inconsistencies will make sense in many cases. If an additive operation (ie, addition or multiplication) on two or more integers results in a smaller value than the operands then you know something went wrong. Every additive operation should be followed by,
if (sum < operand1 || sum < operand2)
omg_error();
Likewise any operation that should logically result in a smaller value should be check to see if it was accidentally embiggin'd.
Have you investigated the use of formal methods to check your code to prove that it is free of overflows? A formal methods technique known as abstract interpretation can check the robustness of your software to prove that your software will not suffer from an overflow, underflow, divide by zero, overflow, or other similar run-time error. It is a mathematical technique that exhaustively analyzes your software. The technique was pioneered by Patrick Cousot in the 1970s. It was successfully used to diagnose an overflow condition in the Arian 5 rocket where an overflow caused the destruction of the launch vehicle. The overflow was caused while converting a floating point number to an integer. You can find more information about this technique here and also on Wikipedia.