Basically, I want to know how you differentiate an error from an exception. In some programming languages, accessing a nonexistent file throws an error, and in others it's an exception. How do you know whether something is an error or an exception?
Like anything else - you either test it or read the documentation. It can be an "Error" or an "Exception" based on the language.
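E.g.
C:
Integer division by zero is undefined behavior; on many platforms the program crashes with a divide-by-zero error (for example, a SIGFPE signal).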
Ruby:
>> 6 / 0
ZeroDivisionError: divided by 0
from (irb):1:in `/'
from (irb):1
(ZeroDivisionError is actually an exception.)
Java:
Code:
int x = 6 / 0;
Output:
Exception in thread "main" java.lang.ArithmeticException: / by zero
It depends on the language:
some languages don't have exceptions
some languages don't use exceptions for everything.
For example, in PHP:
There are exceptions
But divide by 0 doesn't cause an exception to be thrown: it only raises a warning, which doesn't stop the execution of the script. (This describes PHP before version 8; since PHP 8, the / operator throws a DivisionByZeroError instead.)
The following portion of code:
echo 10 / 0;
echo "hello, world!";
Would give this result:
Warning: Division by zero in /.../temp.php on line 5
hello, world!
The terms error and exception are commonly used as jargon terms, with meanings that vary depending upon the programming ecosystem in which they are used.
Conditions
This response follows the lead of Common Lisp, and adopts the term condition as a nonjudgmental way of referring to an "interesting situation" in a program.
What makes a program condition "interesting"? Let's consider the division-by-zero case for real numbers. In the overwhelming majority of cases in which one real is divided by another, the result is another plain ordinary well-behaved real number. These are the "routine" or "uninteresting" cases. However, in the case that the divisor is zero then, mathematically speaking, the result is undefined. The program is now in an "interesting" or "exceptional" condition.
It becomes even more complicated once we take the mathematical ideal of a real number and model it, say, as an IEEE-format floating point number. If we divide 1.0 / 0.0, the IEEE standard (mostly) says that the result is in fact another floating point number, Infinity. Since the result no longer behaves in the same way as a plain old real number, the program condition is once again "interesting" or "exceptional".
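To make the IEEE case concrete, here is a minimal sketch in Python. Python's own `/` operator raises ZeroDivisionError for a zero divisor, so the helper `ieee_divide` is a hypothetical name, not a standard function, and only imitates what the IEEE rules would return.

```python
import math

def ieee_divide(x: float, y: float) -> float:
    """Divide x by y following IEEE 754 rules for a zero divisor.

    Hypothetical helper: Python's own `/` raises ZeroDivisionError instead.
    """
    try:
        return x / y
    except ZeroDivisionError:
        if x == 0.0 or math.isnan(x):
            return math.nan  # 0/0 and NaN/0 are NaN
        # Nonzero / zero is an infinity whose sign combines both operand signs.
        sign = math.copysign(1.0, x) * math.copysign(1.0, y)
        return math.copysign(math.inf, sign)
```

The same division is therefore an exception in one environment and an ordinary (if unusual) value in another, which is exactly the ambiguity discussed here.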
Classifying Conditions
The question is: what should we do when we run into an interesting condition? The answer is dependent upon the context. When classifying program conditions, the following questions are useful:
How likely is it that the condition will occur: certain, probable, unlikely, impossible?
How is the condition detected: program malfunction, distinguished value, signal/handler (aka exception handling), program termination?
How should the condition be handled: ignore it, perform some special action, terminate the program?
The answers to these questions yield 4 x 4 x 3 = 48 distinct cases -- and surely more could be distinguished by further criteria. This brings us to the heart of the matter. We have more than two cases but only two labels, error and exception, to apply to them. Needless to say, there are many possible ways to divide the 48+ cases into two groups.
For example, one could say that anything involving program malfunction is an error, anything else is an exception. Or that anything involving a language's built-in exception handling facilities is an exception, anything else is an error. The possibilities are legion.
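The counting argument can be sketched in a few lines of Python. The tuples below simply restate the answers listed above; the two labelling functions are hypothetical and correspond to the two example rules just mentioned.

```python
from itertools import product

LIKELIHOOD = ("certain", "probable", "unlikely", "impossible")
DETECTION = ("program malfunction", "distinguished value",
             "signal/handler", "program termination")
HANDLING = ("ignore", "special action", "terminate")

# Every combination of answers to the three questions.
CASES = list(product(LIKELIHOOD, DETECTION, HANDLING))

def label_by_malfunction(case):
    """Rule 1: program malfunction is an error, anything else an exception."""
    _, detection, _ = case
    return "error" if detection == "program malfunction" else "exception"

def label_by_handler(case):
    """Rule 2: built-in exception handling is an exception, anything else an error."""
    _, detection, _ = case
    return "exception" if detection == "signal/handler" else "error"

# Cases on which the two equally plausible rules disagree.
DISAGREEMENTS = [c for c in CASES
                 if label_by_malfunction(c) != label_by_handler(c)]
```

Under these assumptions the two rules disagree on half of the cases, which is why neither label means much without context.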
Examples
End-Of-File
When reading and processing a stream of characters, hitting the end-of-file is certain. In C, this event is detected by means of a distinguished return value from an I/O function, a so-called error return value. Thus, one speaks of an EOF error.
Division-By-Zero
When dividing two user-entered numbers in a simple calculator program, we want to give a meaningful result even if the user enters a divisor of zero. In some C environments, division-by-zero results in a signal (SIGFPE) that must be fielded by a signal handler. Signals are sometimes called exceptions in the C community and, confusingly, sometimes called program error signals. In other C environments, IEEE floating-point rules apply and the division-by-zero would result in an infinity (or, for 0/0, a NaN) value. The C environment would be blissfully unaware of that value, considering it to be neither an exception nor an error.
Runtime Load Failure
Programs frequently load their program code dynamically at run-time (e.g. classes, DLLs). This might fail due to a missing file. C offers no standard way to detect or recover from this case. The program would be terminated involuntarily, and one often speaks of this situation as a fatal exception. In Java, this would be termed a linkage error.
Java's Throwable Hierarchy
Java's exception-handling system divides the so-called Throwable class hierarchy into two main groups. Subclasses of Error are meant to represent conditions from which recovery is impossible. Subclasses of Exception are meant for recoverable conditions and are further subdivided into checked exceptions (for probable conditions) and unchecked exceptions (for unlikely conditions). Unfortunately, the boundaries between these categories are poorly defined and you will often find instances of throwables whose semantics suggest that they belong in a different category.
Be Wary Of Jargon
These examples show that the meanings of error and exception are murky at best. One must treat error and exception as jargon, whose meaning is determined by the context of discussion.
Of greater value are distinguishing characteristics of program conditions. What is the likelihood of the condition occurring? How is the condition detected? What action should be taken when the condition is detected? In any discussion that demands clarity, one is better suited to answer these questions directly rather than relying upon jargon terminology.
Exceptions should indicate exceptional activity, so if you reach a point in your code for which you've done your best to avoid divide by zero, then throwing an exception (if you are able to in your language) is the right way.
If it's routine logic to check for divide by zero (like for a calculator app) then you should check for that in your code before it has the chance to raise an exception. In that case, it's an error (in user input) and should be handled as such.
(Stole this idea either from The Pragmatic Programmer or Code Complete; can't remember which.)
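A minimal sketch of that split, in Python. The function names `divide` and `calculator_divide` are made up for illustration: the user-facing function treats a zero divisor as routine input to validate, while the internal helper treats it as a broken promise and raises.

```python
def divide(numerator: float, divisor: float) -> float:
    """Internal helper: callers are expected to have ruled out a zero divisor."""
    if divisor == 0:
        # Reaching this point means a caller broke its promise: exceptional.
        raise ZeroDivisionError("divide() called with a zero divisor")
    return numerator / divisor

def calculator_divide(numerator_text: str, divisor_text: str) -> str:
    """User-facing entry point: a zero divisor is routine bad input."""
    try:
        numerator = float(numerator_text)
        divisor = float(divisor_text)
    except ValueError:
        return "Error: please enter two numbers"
    if divisor == 0:
        return "Error: cannot divide by zero"
    return str(divide(numerator, divisor))
```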
In the Squeak System Browser, some messages have a red flag on the left side.
I saw from the balloon message that this is because there is an interruption in the code, that is, a self halt: 'foobar' or a self error:'foobar'.
Is this so bad? How can I avoid it? I put the error message where something particularly bad has happened and going further makes little sense, such as a failure in authentication, a failure to get data from the network, and so on.
I would like to do something like raising an exception on these particular occasions but, if possible, I don't want to see red flags on half of my methods.
Is there a standard practice for this?
Halts
A #halt is something you use for debugging purposes. In most cases, you insert a #halt when you want to reach a point in the execution flow and continue from there using the debugger, stepping and inspecting the objects involved. You may also want to add a #halt to see whether a certain method gets invoked or not, to better understand what's actually happening when you evaluate some expression. In all these cases the #halt should be removed as soon as your debugging is finished.
As a typical example, imagine you are debugging an algorithm and you need to better understand why it fails. Then you insert a #halt:
computeDiagonal: k
| product akk diff |
product := self dotProductLimitedTo: k withRow: k.
akk := matrix atRow: k column: k.
diff := akk - product.
diff < 0.0 ifTrue: [
state := #fail.
^self halt]. "wait a minute!"
lower atRow: k column: k put: diff sqrt
Assertions
There are cases, however, where your investigation wasn't conclusive enough, or the issue you are analyzing is not reproducible. It would then be a good idea to leave some longer-term indication that something should not be happening, or if it does, offer an opportunity to better understand its cause. In these situations a halt could work but may not be expressive enough (you are no longer immersed in the original problem), so you might want to consider an #assert: or #deny: instead. These messages, which are usually sent in unit tests, can also be present in any method and will convey a clearer intention.
Note that the decision to use #halt or #assert: doesn't depend on the method, but on the state of maturity of your model. For instance, if you aren't quite sure the algorithm won't fail again, but you cannot reproduce a failure, you should replace the #halt with an #assert: or #deny:
computeDiagonal: k
| product akk diff |
product := self dotProductLimitedTo: k withRow: k.
akk := matrix atRow: k column: k.
diff := akk - product.
self deny: diff < 0.0. "got you!"
lower atRow: k column: k put: diff sqrt
Errors
Finally, if you are pretty sure something should/shouldn't happen, #error: is your best choice. The difference between halt, assert: and error: is that the latter is for end users while the others are for developers.
computeDiagonal: k
| product akk diff |
product := self dotProductLimitedTo: k withRow: k.
akk := matrix atRow: k column: k.
diff := akk - product.
diff < 0.0 ifTrue: [self error: 'Cholesky decomposition failed']. "Oh oh..."
lower atRow: k column: k put: diff sqrt
Of course, to take full advantage of the Exception framework, you might want to consider adding your own version of the #error: message, so that it would signal a specific subclass of Exception, rather than the generic one. There are plenty of examples in the system for you to get inspiration. This is not always necessary (or good), it is just something to think about.
Note also that an Error may be resumable, so do not associate them with aborting strategies. In fact, #halt and #assert: do signal resumable exceptions.
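For readers who don't know Smalltalk, the idea of signaling a specific exception subclass that carries useful state can be sketched in Python. This is only an analogy, not Squeak code, and `CholeskyFailure` and `diagonal_entry` are hypothetical names loosely modelled on the method above.

```python
import math

class CholeskyFailure(ArithmeticError):
    """Hypothetical specific error that carries the state which caused it."""

    def __init__(self, k: int, diff: float):
        super().__init__(f"Cholesky decomposition failed at row {k} (diff={diff})")
        self.k = k
        self.diff = diff

def diagonal_entry(akk: float, product: float, k: int) -> float:
    """Return sqrt(akk - product), or raise CholeskyFailure if that is negative."""
    diff = akk - product
    if diff < 0.0:
        raise CholeskyFailure(k, diff)
    return math.sqrt(diff)
```

A handler that catches CholeskyFailure can then inspect k and diff directly instead of parsing a text message.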
Conclusion
The debugger is your best friend, and the #halt message will bring it anywhere in your code. However, leaving a #halt in code that has been published will be interpreted as an indication of unfinished work.
Assertions may help other developers to better understand how to use your objects. But please, resist the temptation of being too assertive.
Errors are an elegant way of declaring unexpected behavior in a way that would allow the developer (you) to have a clue on what might have happened. Don't think of errors as text messages, errors in Smalltalk are first class objects that may contain valuable information.
As you have figured out, the red flag means there is some kind of halt in the message. This is fine in development code, where you need to halt execution to check the state.
That being said, such code does not belong in production code. It should be replaced by exceptions.
Squeak provides the following ANSI-compatible messages for working with exceptions:
Evaluating Blocks with Exceptions
Methods for handling Exceptions raised in a BlockContext
Message: ensure: aTerminationBlock
Description: Evaluate aTerminationBlock after evaluating the receiver, regardless of whether the receiver's evaluation completes.
Message: ifCurtailed: aTerminationBlock
Description: Evaluate the receiver. If it terminates abnormally, evaluate aTerminationBlock.
Message: on: exception do: handlerActionBlock
Description: Evaluate the receiver in the scope of an exception handler, handlerActionBlock.
Examples
["target code, which may abort"]
ensure:
["code that will always be executed
after the target code,
whatever may happen"]
["target code, which may abort"]
ifCurtailed:
["code that will be executed
whenever the target code terminates
without a normal return"]
["target code, which may abort"]
on: Exception
do: [:exception |
"code that will be executed whenever
the identified Exception is signaled."]
The source of the information is Squeak Smalltalk: Classes Reference.
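If you are more familiar with try/finally-style languages, the three messages map roughly onto the following Python constructs. This is an analogy under stated assumptions, not Squeak code: `risky` and the `with_*` functions are made-up names, and ifCurtailed: also covers non-local returns, which this sketch does not model.

```python
log = []

def risky(fail: bool) -> str:
    """Stand-in for the target code, which may abort."""
    if fail:
        raise ValueError("target code aborted")
    return "ok"

def with_ensure(fail: bool):
    # Like ensure: the finally clause runs whether or not risky() completes.
    try:
        return risky(fail)
    finally:
        log.append("ensure")

def with_if_curtailed(fail: bool):
    # Like ifCurtailed: cleanup runs only on abnormal termination,
    # and the abnormal termination then continues.
    try:
        return risky(fail)
    except BaseException:
        log.append("curtailed")
        raise

def with_handler(fail: bool):
    # Like on:do: the handler runs only when the named exception class is raised.
    try:
        return risky(fail)
    except ValueError as exc:
        log.append(f"handled: {exc}")
        return None
```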
Is there a way to ensure that:
if a==b then devfun(a)==devfun(b);
where devfun() is a device function that involves some floating-point math operations (e.g. polynomials) and returns floating-point results, and a and b are floating-point variables.
I don't care about cross-implementation consistency (e.g. different compilers, OSes, driver versions, or compiler options). I only care whether, within the same building/program at runtime, the results returned by devfun() are consistent across function calls, such that as long as a==b, devfun(a)==devfun(b).
I am talking about SM2.0+ hardware and CUDA 5.0+, in case that is relevant.
Let's assume that your numbers a and b represent properly normalized IEEE-754 representation floating point numbers and that neither a nor b is a NaN value. Let's also assume a and b are both 32-bit, or else a and b are both 64-bit (IEEE-754 floating point representations).
In that case, I believe the (ISO C/C++, or CUDA C/C++) floating point test for equality (==) will return TRUE when the two numbers a and b are bitwise identical (and FALSE otherwise). The one wrinkle is signed zero: +0.0 and -0.0 compare equal even though their sign bits differ, so let's also assume that case is excluded.
Under the TRUE case, with one exception, I believe it is safe to assume that devfun(a) == devfun(b) without any additional conditions except the obvious ones: there is no difference in the behavior of devfun on either side of the == operation, that is, it's the same code, compiled in the same way, executed under the same conditions (e.g. other variables that may be taking part in devfun, same GPU type, etc.), just as you've indicated in your question: "same building/program".
The one exception is if the result of devfun(a) is NaN, since (IEEE-754) NaN != NaN.
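The IEEE comparison rules involved here can be checked on the host with a few lines of Python. This only illustrates how == behaves for binary64 values, not anything about GPU execution, and `devfun` below is a hypothetical stand-in for the device function. It also shows a second corner case besides NaN: +0.0 and -0.0 compare equal but are not bitwise identical, so a function that looks at the sign can tell them apart.

```python
import math
import struct

def bits(x: float) -> int:
    """Raw IEEE 754 binary64 bit pattern of x."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def devfun(x: float) -> float:
    """Hypothetical stand-in: a polynomial multiplied by the sign of x."""
    return math.copysign(1.0, x) * (3.0 * x * x + 2.0 * x + 1.0)

a, b = 0.0, -0.0
nan = float("nan")

zeros_compare_equal = (a == b)                # True
zeros_bitwise_equal = (bits(a) == bits(b))    # False: the sign bit differs
results_equal = (devfun(a) == devfun(b))      # False: 1.0 versus -1.0
nan_equals_itself = (nan == nan)              # False, even for identical bits
```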
It would be interesting (to me) if you think you have a piece of code that disproves this assertion.
Perhaps floating point ninjas will come along and correct me.
Perhaps also I would be remiss if I did not say something about the hazards of floating point comparisons. If you're not familiar with this (most folks would never recommend performing a test a==b on two floating point numbers) you can find many questions about it on SO.
For the same reasons that floating point equality comparison (==) in general is unwise, I think relying on the above assertion, even if it's true, is unwise. Let me give you one example.
Suppose you compile code for architecture sm_20. Now you run the code on an sm_21 device. This one simple variation could result in a JIT-compile at runtime. Now you are no longer running the same code, and all bets are off.
So, again, even if the above is true, I think it's unwise for you to rely on such a statement:
if a==b, then devfun(a) == devfun(b)
I've been interested in compiler/interpreter design/implementation for as long as I've been programming (only 5 years now) and it's always seemed like the "magic" behind the scenes that nobody really talks about (I know of at least 2 forums for operating system development, but I don't know of any community for compiler/interpreter/language development). Anyways, recently I've decided to start working on my own, in hopes to expand my knowledge of programming as a whole (and hey, it's pretty fun :). So, based off the limited amount of reading material I have, and Wikipedia, I've developed this concept of the components for a compiler/interpreter:
Source code -> Lexical Analysis -> Abstract Syntax Tree -> Syntactic Analysis -> Semantic Analysis -> Code Generation -> Executable Code.
(I know there's more to code generation and executable code, but I haven't gotten that far yet :)
And with that knowledge, I've created a very basic lexer (in Java) to take input from a source file, and output the tokens into another file. A sample input/output would look like this:
Input:
int a := 2
if(a = 3) then
print "Yay!"
endif
Output (from lexer):
INTEGER
A
ASSIGN
2
IF
L_PAR
A
COMP
3
R_PAR
THEN
PRINT
YAY!
ENDIF
Personally, I think it would be really easy to go from there to syntactic/semantic analysis, and possibly even code generation, which leads me to question: Why use an AST, when it seems that my lexer is doing just as good a job? However, 100% of my sources I use to research this topic all seem adamant that this is a necessary part of any compiler/interpreter. Am I missing the point of what an AST really is (a tree that shows the logical flow of a program)?
TL;DR: Currently en route to developing a compiler, finished the lexer, seems to me like the output would make for easy syntactic analysis/semantic analysis, rather than doing an AST. So why use one? Am I missing the point of one?
Thanks!
First off, one thing about your list of components does not make sense. Building an AST is (pretty much) the syntactic analysis, so it either shouldn't be in there, or at least come before the AST.
What you got there is a lexer. All it gives you are individual tokens. In any case, you will need an actual parser, because regular languages aren't any fun to program in. You can't even (properly) nest expressions. Heck, you can't even handle operator precedence. A token stream doesn't give you:
1) An idea where statements and expressions start and end.
2) An idea how statements are grouped into blocks.
3) An idea which part of the expression has which precedence, associativity, etc.
4) A clear, uncluttered view of the actual structure of the program.
5) A structure which can be passed through a myriad of transformations, without every single pass knowing and having code to accommodate that the condition in an if is enclosed by parentheses.
6) ... more generally, any kind of comprehension above the level of a single token.
Suppose you have two passes in your compiler which optimize certain kinds of operators applied to certain arguments (say, constant folding and algebraic simplifications like x - x -> 0). If you hand them tokens for the expression x - x * 1, these passes are cluttered with figuring out that the x * 1 part comes first. And they have to know that, lest the transformation is incorrect (consider 1 + 2 * 3).
These things are tricky enough to get right as it is, so you don't want to be pestered by parsing problems as well. That's why you solve the parsing problem first, in a separate parsing step. Then you can, say, replace a function call with its definition, without worrying about adding parentheses so the meaning remains the same. You save time, you separate concerns, you avoid repetition, you enable simpler code in many other places, etc.
A parser figures all that out, and builds an AST which consequently holds all that information. Without any further data on the nodes, the shape of the AST alone gives you no. 1, 2, 3, and much more, for free. None of the bazillion passes that follow have to worry about it anymore.
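As a rough illustration of what "the parser figures that out" means, here is a small recursive-descent parser in Python for arithmetic expressions only. It is a sketch, not a parser for the language in the question: the grammar and the names `tokenize` and `parse` are assumptions. The point is that the shape of the result already encodes precedence and grouping.

```python
import re

def tokenize(text):
    """Split the input into integer, identifier and operator tokens."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)

def parse(tokens):
    """Return the expression as nested tuples of the form (op, left, right)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def primary():
        tok = advance()
        if tok == "(":
            node = expression()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok in ("+", "-", "*", "/", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return int(tok) if tok.isdigit() else tok

    def term():  # binds tighter: * and /
        node = primary()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, primary())
        return node

    def expression():  # binds looser: + and -
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

With this sketch, the tokens for x - x * 1 come out with the multiplication nested inside the subtraction, so a later pass never has to rediscover that x * 1 binds first.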
That's not to say you always have to have an AST. For sufficiently simple languages, you can do a single-pass compiler. Instead of generating an AST or some other intermediate representation during parsing, you emit code as you go. However, this becomes harder for less simple languages and you can't reasonably do a lot of stuff (such as 70% of all optimizations and diagnostics -- and yes I just made that number up). Generally, I wouldn't advise you to do this. There are good reasons single-pass compilers are mostly dead. Even languages which permit them (e.g. C) are nowadays implemented with multiple passes and ASTs. It's a simple way to get started, but will severely limit you (and the language, if you design it) later.
You've got the AST at the wrong point in your flow diagram. Typically, the output of the lexer is a series of tokens (as you have in your output), and these are fed to the parser/syntactic analyzer, which generates the AST. So the output of your lexer is different from an AST because they are used at different points in the compilation process and fulfill different purposes.
The next logical question is: What, then, is an AST? Well, the purpose of parsing/syntactic analysis is to turn the series of tokens generated by the lexer into an AST (or parse tree). The AST is an intermediate representation that captures the relationship between syntactical elements in a way that is easier to work with programmatically. One way of thinking about this is that a text program is a one dimensional construct, and can only represent ideas as a sequence of elements, while the AST is freed from this constraint, and can represent the underlying relationships between those elements in 2 dimensions (as typically drawn), or any higher dimension space if you so choose to think about it that way.
For instance, a binary operator has two operands, let's call them A and B. In code, this may be spelled 'A * B' (assuming an infix operator - another advantage of an AST is to hide such distinctions that may be important syntactically, but not semantically), but for the compiler to "understand" this expression, it must read 5 characters sequentially, and this logic can quickly become cumbersome, given the many possibilities in even a small language. In an AST representation, however, we have a "binary operator" node whose value is '*', and that node has two children, values 'A' and 'B'.
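A "binary operator" node like that might look as follows in Python. The class names and the `simplify` pass are hypothetical and assume integer arithmetic; the sketch is only meant to show how little a pass needs to know about syntax once it works on nodes instead of characters.

```python
import operator
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Name:
    ident: str

@dataclass(frozen=True)
class Num:
    value: int

@dataclass(frozen=True)
class BinaryOp:
    op: str
    left: "Expr"
    right: "Expr"

Expr = Union[Name, Num, BinaryOp]

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def simplify(node: Expr) -> Expr:
    """Bottom-up pass: constant folding plus x * 1 -> x and x - x -> 0."""
    if not isinstance(node, BinaryOp):
        return node
    left, right = simplify(node.left), simplify(node.right)
    if isinstance(left, Num) and isinstance(right, Num) and node.op in FOLDABLE:
        return Num(FOLDABLE[node.op](left.value, right.value))
    if node.op == "*" and right == Num(1):
        return left
    if node.op == "-" and left == right:
        return Num(0)
    return BinaryOp(node.op, left, right)
```

For example, the tree for x - x * 1 first loses the multiplication by 1 and then collapses to the constant 0, with no parsing logic in the pass itself.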
As your compiler project progresses, I think you will begin to see the advantages of this representation.
In several modern programming languages (including C++, Java, and C#), the language allows integer overflow to occur at runtime without raising any kind of error condition.
For example, consider this (contrived) C# method, which does not account for the possibility of overflow/underflow. (For brevity, the method also doesn't handle the case where the specified list is a null reference.)
//Returns the sum of the values in the specified list.
private static int sumList(List<int> list)
{
    int sum = 0;
    foreach (int listItem in list)
    {
        sum += listItem;
    }
    return sum;
}
If this method is called as follows:
List<int> list = new List<int>();
list.Add(2000000000);
list.Add(2000000000);
int sum = sumList(list);
An overflow will occur in the sumList() method (because the int type in C# is a 32-bit signed integer, and the sum of the values in the list exceeds the value of the maximum 32-bit signed integer). The sum variable will have a value of -294967296 (not a value of 4000000000); this most likely is not what the (hypothetical) developer of the sumList method intended.
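The wrapped value can be reproduced with a short Python sketch. Python integers are unbounded, so the 32-bit two's-complement wraparound has to be simulated; `to_int32` and `sum_list_wrapping` are hypothetical helper names, not part of any library.

```python
def to_int32(value: int) -> int:
    """Reduce an unbounded Python int to a 32-bit two's-complement value."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value >= 0x8000_0000 else value

def sum_list_wrapping(values) -> int:
    """Sum the values the way unchecked 32-bit signed arithmetic would."""
    total = 0
    for item in values:
        total = to_int32(total + item)
    return total

wrapped = sum_list_wrapping([2000000000, 2000000000])  # -294967296
```

The result is 4000000000 minus 2^32 (4294967296), which is where -294967296 comes from.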
Obviously, there are various techniques that can be used by developers to avoid the possibility of integer overflow, such as using a type like Java's BigInteger, or the checked keyword and /checked compiler switch in C#.
However, the question that I'm interested in is why these languages were designed to allow integer overflows by default in the first place, instead of, for example, raising an exception when an operation is performed at runtime that would result in an overflow. It seems like such behavior would help avoid bugs in cases where a developer neglects to account for the possibility of overflow when writing code that performs an arithmetic operation that could result in overflow. (These languages could have included something like an "unchecked" keyword that could designate a block where integer overflow is permitted to occur without an exception being raised, in those cases where that behavior is explicitly intended by the developer; C# actually does have this.)
Does the answer simply boil down to performance -- the language designers didn't want their respective languages to default to having "slow" arithmetic integer operations where the runtime would need to do extra work to check whether an overflow occurred, on every applicable arithmetic operation -- and this performance consideration outweighed the value of avoiding "silent" failures in the case that an inadvertent overflow occurs?
Are there other reasons for this language design decision as well, other than performance considerations?
In C#, it was a question of performance. Specifically, out-of-box benchmarking.
When C# was new, Microsoft was hoping a lot of C++ developers would switch to it. They knew that many C++ folks thought of C++ as being fast, especially faster than languages that "wasted" time on automatic memory management and the like.
Both potential adopters and magazine reviewers are likely to get a copy of the new C#, install it, build a trivial app that no one would ever write in the real world, run it in a tight loop, and measure how long it took. Then they'd make a decision for their company or publish an article based on that result.
The fact that their test showed C# to be slower than natively compiled C++ is the kind of thing that would turn people off C# quickly. The fact that your C# app is going to catch overflow/underflow automatically is the kind of thing that they might miss. So, it's off by default.
I think it's obvious that 99% of the time we want /checked to be on. It's an unfortunate compromise.
I think performance is a pretty good reason. If you consider every instruction in a typical program that increments an integer, and if instead of the simple op to add 1, it had to check every time if adding 1 would overflow the type, then the cost in extra cycles would be pretty severe.
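To give a feel for the extra work, here is a sketch in Python of what a checked 32-bit addition has to do when no wider type is available. `checked_add` is a hypothetical name, and real hardware or compilers may do this more cheaply (for example with an overflow flag), so treat it as an illustration of the comparisons involved rather than a cost model.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def checked_add(a: int, b: int) -> int:
    """Add two 32-bit values, raising instead of wrapping on overflow.

    The checks are written so that they never need a value outside the
    32-bit range, as a fixed-width implementation would require.
    """
    if b > 0 and a > INT32_MAX - b:
        raise OverflowError(f"{a} + {b} exceeds INT32_MAX")
    if b < 0 and a < INT32_MIN - b:
        raise OverflowError(f"{a} + {b} is below INT32_MIN")
    return a + b
```

Every addition now carries up to two comparisons and branches in addition to the add itself.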
You work under the assumption that integer overflow is always undesired behavior.
Sometimes integer overflow is desired behavior. One example I've seen is representation of an absolute heading value as a fixed point number. Given an unsigned int, 0 is 0 or 360 degrees and the max 32 bit unsigned integer (0xffffffff) is the biggest value just below 360 degrees.
#include <cstdint>
#include <iostream>

int main()
{
    // Fixed-point heading: the full 32-bit range maps onto one full turn.
    std::uint32_t shipsHeadingInDegrees = 0;
    // Rotate by a bunch of degrees
    shipsHeadingInDegrees += 0x80000000; // 180 degrees
    shipsHeadingInDegrees += 0x80000000; // another 180 degrees, wraps around to 0
    shipsHeadingInDegrees += 0x80000000; // another 180 degrees
    // Ships heading now will be 180 degrees
    std::cout << "Ships Heading Is " << (double(shipsHeadingInDegrees) / double(0xffffffff)) * 360.0 << std::endl;
}
There are probably other situations where overflow is acceptable, similar to this example.
C/C++ never mandate trap behaviour. Even the obvious division by 0 is undefined behaviour in C++, not a specified kind of trap.
The C language doesn't have any concept of trapping, unless you count signals.
C++ has a design principle that it doesn't introduce overhead not present in C unless you ask for it. So Stroustrup would not have wanted to mandate that integers behave in a way which requires any explicit checking.
Some early compilers, and lightweight implementations for restricted hardware, don't support exceptions at all, and exceptions can often be disabled with compiler options. Mandating exceptions for language built-ins would be problematic.
Even if C++ had made integers checked, 99% of programmers in the early days would have turned it off for the performance boost...
Because checking for overflow takes time. Each primitive mathematical operation, which normally translates into a single assembly instruction would have to include a check for overflow, resulting in multiple assembly instructions, potentially resulting in a program that is several times slower.
It is likely 99% performance. On x86 you would have to check the overflow flag on every operation, which would be a huge performance hit.
The other 1% would cover those cases where people are doing fancy bit manipulations or being 'imprecise' in mixing signed and unsigned operations and want the overflow semantics.
Backwards compatibility is a big one. With C, it was assumed that you were paying enough attention to the size of your datatypes that if an over/underflow occurred, it was what you wanted. Then with C++, C# and Java, very little changed with how the "built-in" data types worked.
If integer overflow is defined as immediately raising a signal, throwing an exception, or otherwise deflecting program execution, then any computations which might overflow will need to be performed in the specified sequence. Even on platforms where integer overflow checking wouldn't cost anything directly, the requirement that integer overflow be trapped at exactly the right point in a program's execution sequence would severely impede many useful optimizations.
If a language were to specify that integer overflows would instead set a latching error flag, were to limit how actions on that flag within a function could affect its value within calling code, and were to provide that the flag need not be set in circumstances where an overflow could not result in erroneous output or behavior, then compilers could generate more efficient code than any kind of manual overflow-checking programmers could use. As a simple example, if one had a function in C that would multiply two numbers and return a result, setting an error flag in case of overflow, a compiler would be required to perform the multiplication whether or not the caller would ever use the result. In a language with looser rules like I described, however, a compiler that determined that nothing ever uses the result of the multiply could infer that overflow could not affect a program's output, and skip the multiply altogether.
From a practical standpoint, most programs don't care about precisely when overflows occur, so much as they need to guarantee that they don't produce erroneous results as a consequence of overflow. Unfortunately, programming languages' integer-overflow-detection semantics have not caught up with what would be necessary to let compilers produce efficient code.
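The programming model of a latching flag (though of course not the compiler optimizations it would enable) can be sketched in Python. `LatchingInt32` and `sum_list` are hypothetical names: arithmetic never traps, overflow sets a sticky flag, and the caller checks the flag once at the end.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

class LatchingInt32:
    """Hypothetical 32-bit arithmetic context with a sticky overflow flag."""

    def __init__(self):
        self.overflowed = False

    def _wrap(self, value: int) -> int:
        if not INT32_MIN <= value <= INT32_MAX:
            self.overflowed = True  # latch: later operations never clear it
            value = (value - INT32_MIN) % 2**32 + INT32_MIN
        return value

    def add(self, a: int, b: int) -> int:
        return self._wrap(a + b)

    def mul(self, a: int, b: int) -> int:
        return self._wrap(a * b)

def sum_list(values) -> int:
    """Sum without per-operation traps; report overflow once, at the end."""
    ctx = LatchingInt32()
    total = 0
    for item in values:
        total = ctx.add(total, item)
    if ctx.overflowed:
        raise OverflowError("sum does not fit in 32 bits")
    return total
```

The caller only learns that some overflow happened, not precisely when, which matches the observation that most programs care about the former.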
My understanding of why errors would not be raised by default at runtime boils down to the legacy of desiring to create programming languages with ACID-like behavior. Specifically, the tenet that anything that you code it to do (or don't code), it will do (or not do). If you didn't code some error handler, then the machine will "assume" by virtue of no error handler, that you really want to do the ridiculous, crash-prone thing you're telling it to do.
(ACID reference: http://en.wikipedia.org/wiki/ACID)
The word seems to get used in a number of contexts. The best I can figure is that they mean a variable that can't change. Isn't that what constants/finals (darn you Java!) are for?
An invariant is more "conceptual" than a variable. In general, it's a property of the program state that is always true. A function or method that ensures that the invariant holds is said to maintain the invariant.
For instance, a binary search tree might have the invariant that for every node, the key of the node's left child is less than the node's own key. A correctly written insertion function for this tree will maintain that invariant.
As you can tell, that's not the sort of thing you can store in a variable: it's more a statement about the program. By figuring out what sort of invariants your program should maintain, then reviewing your code to make sure that it actually maintains those invariants, you can avoid logical errors in your code.
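Here is a minimal Python sketch of such a tree, with an insertion function that maintains the invariant and a separate function that checks it. The names are made up, and the check uses the usual stronger form of the invariant (every key in a left subtree is smaller than the node's key, every key in a right subtree is not smaller), which is an assumption beyond the one-line description above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def insert(root: Optional[Node], key: int) -> Node:
    """Insert key and return the root; maintains the ordering invariant."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def holds_invariant(node: Optional[Node], low=None, high=None) -> bool:
    """Check that every key lies in the range allowed by its ancestors."""
    if node is None:
        return True
    if low is not None and node.key < low:
        return False
    if high is not None and node.key >= high:
        return False
    return (holds_invariant(node.left, low, node.key)
            and holds_invariant(node.right, node.key, high))
```

Calling holds_invariant after each operation while debugging is one way to "review your code" mechanically.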
It is a condition you know to always be true at a particular place in your logic and can check for when debugging to work out what has gone wrong.
The magic of Wikipedia: Invariant (computer science)
In computer science, a predicate that, if true, will remain true throughout a specific sequence of operations, is called (an) invariant to that sequence.
This answer is for my 5 year old kid. Do not think of an invariant as a constant or fixed numerical value. But it can be. However, it is more than that.
Rather, an invariant is something like a fixed relationship between varying entities. For example, your age will always be less than that of your biological parents. Both your age and your parents' ages change with the passage of time, but the relationship that I mentioned above is an invariant.
An invariant can also be a numerical constant. For example, the value of pi is the invariant ratio of a circle's circumference to its diameter. No matter how big or small the circle is, that ratio will always be pi.
I usually view them more in terms of algorithms or structures.
For example, you could have a loop invariant that can be asserted: something that is always true at the beginning or end of each iteration. That is, if your loop is supposed to move a collection of objects from one stack to another, you could say that |stack1| + |stack2| = c at the top or bottom of the loop.
If the invariant check fails, it indicates that something went wrong. In this example, it could mean that you forgot to push the processed element onto the final stack.
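The two-stack loop above could be sketched in C++ roughly as follows, with the invariant |stack1| + |stack2| = c asserted at the top and bottom of each iteration. The function name `moveAll` and the use of `std::stack<int>` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>

// Moves every element from `from` to `to`, asserting the loop invariant
// from.size() + to.size() == c at the top and bottom of each iteration.
void moveAll(std::stack<int>& from, std::stack<int>& to) {
    const std::size_t c = from.size() + to.size();
    while (!from.empty()) {
        assert(from.size() + to.size() == c);  // top of the loop
        int item = from.top();
        from.pop();
        // In the middle of an iteration the invariant is temporarily
        // broken: the two sizes add up to c - 1 here.
        to.push(item);  // forgetting this push trips the assertion below
        assert(from.size() + to.size() == c);  // bottom of the loop
    }
}
```

Note that the invariant is only required to hold at the chosen checkpoints, not at every statement inside the loop body.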
As this line states:
In computer science, a predicate that, if true, will remain true throughout a specific sequence of operations, is called (an) invariant to that sequence.
To better understand this, I hope this example in C++ helps.
Consider a scenario where you have to read some values, keep the number of values read in a variable called count, and add them up in a variable called sum.
The invariant (again it's more like a concept):
// invariant:
// we have read count grades so far, and
// sum is the sum of the first count grades
The code for the above would be something like this,
int count = 0;
double sum = 0, x = 0;
while (cin >> x) {
    ++count;
    sum += x;
}
What does the above code do?
1) Reads a value from cin and stores it in x
2) After each successful read, increments count and adds x to sum (sum = sum + x)
3) Repeats steps 1-2 until the read fails (e.g. at end of input, Ctrl+D on Unix-like systems)
Loop invariant:
The invariant must ALWAYS be true. Suppose you start out with just this:
while (cin >> x) {
}
This loop reads data from standard input and stores it in x. Well and good. But the invariant is now false, because its first part is not kept true:
// we have read count grades so far, and
How do we keep the invariant true?
Simple: increment count.
So ++count; does the job. Now our code becomes something like this:
while (cin >> x) {
    ++count;
}
But even now our invariant (a condition which must be TRUE) is false, because we have not satisfied its second part:
// sum is the sum of the first count grades
So what do we do now?
Add x to sum and store the result in sum (sum += x); the next time around, cin >> x will read a new value into x.
Now our code becomes something like this:
while (cin >> x) {
    ++count;
    sum += x;
}
Let's check whether the code matches our invariant:
// invariant:
// we have read count grades so far, and
// sum is the sum of the first count grades
code:
while (cin >> x) {
    ++count;
    sum += x;
}
Ah! Now the loop invariant always holds and the code works fine.
The above example was taken and modified from the book Accelerated C++ by Andrew Koenig and Barbara E. Moo.
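As a variation on the loop above (not taken from the book), the invariant can be checked with assertions instead of only being stated in a comment. In this sketch the `Totals` struct, the `readGrades` function and the extra `seen` vector are all hypothetical; `seen` exists only so that there is something to check the invariant against.

```cpp
#include <cassert>
#include <cmath>
#include <istream>
#include <numeric>
#include <sstream>  // for std::istringstream when trying the function out
#include <vector>

struct Totals {
    int count = 0;
    double sum = 0;
};

// Reads numbers from `in` until a read fails (end of input or a bad token).
Totals readGrades(std::istream& in) {
    Totals t;
    std::vector<double> seen;  // kept only so the invariant can be checked
    double x = 0;
    while (in >> x) {
        seen.push_back(x);
        ++t.count;
        t.sum += x;
        // invariant:
        //   we have read count grades so far, and
        //   sum is the sum of the first count grades
        assert(t.count == static_cast<int>(seen.size()));
        double expected = std::accumulate(seen.begin(), seen.end(), 0.0);
        assert(std::fabs(t.sum - expected) < 1e-9);
    }
    return t;
}
```

For example, passing a `std::istringstream` holding `80 90.5 70` should give a count of 3 and a sum of 240.5. Deleting either `++t.count;` or `t.sum += x;` makes an assertion fail on the first grade read, which mirrors the reasoning above.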
Something that doesn't change within a block of code
All the answers here are great, but I felt that I could shed more light on the matter:
From a language point of view, invariant means something that never changes. The concept actually comes from math, though: it is one of the popular proof techniques when combined with induction.
Here is how such a proof goes: if you can find an invariant that holds in the initial state, and this invariant persists under any [legal] transformation applied to the state, then you can prove that a state lacking this invariant can never occur, no matter what sequence of transformations is applied to the initial state.
This way of thinking (again combined with induction) makes it possible to predicate the logic of computer software. It is especially important when execution goes in loops, where an invariant can be used to prove that a certain loop will yield a certain result, or that it will never change the state of a program in a certain way.
When an invariant is used to predicate the logic of a loop, it's called a loop invariant. Invariants can be used outside loops, but for loops they are really important, because you often have a lot of possibilities, or an infinite number of them.
Notice that I use the word "predicate" for the logic of computer software, and not "prove". That's because, while in math an invariant can be used as a proof, it can never prove that the software will yield what is expected when executed, since the software runs on top of many abstractions that can never be proved to yield what is expected (think of the hardware abstraction, for example).
Finally, while rigorously predicting software logic in theory is only important for highly critical applications such as medical and military ones, invariants can still aid the typical programmer when debugging. They can be used to pinpoint the location where the program failed because it did not maintain a certain invariant. Many of us do this anyway without giving it a thought.
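The proof technique described above (an invariant that holds in the initial state and survives every legal transformation) can be illustrated with a toy system. Everything here is made up for illustration: the state is a single integer starting at 0, the only legal moves are "add 4" and "subtract 6", and the invariant is "the state is even". Since both moves preserve evenness, an odd state such as 7 can never occur.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Invariant of the toy system: the state is an even number.
bool invariant(int x) { return x % 2 == 0; }

// Collects every state reachable from `start` in at most `steps` moves,
// where the legal moves are "add 4" and "subtract 6".
std::set<int> reachable(int start, int steps) {
    assert(invariant(start));  // base case: holds in the initial state
    std::set<int> seen{start};
    std::vector<int> frontier{start};
    for (int i = 0; i < steps; ++i) {
        std::vector<int> next;
        for (int x : frontier) {
            for (int y : {x + 4, x - 6}) {
                assert(invariant(y));  // inductive step: each move keeps it
                if (seen.insert(y).second) next.push_back(y);
            }
        }
        frontier = next;
    }
    return seen;
}
```

The bounded search only spot-checks the claim. It is the invariant argument itself, not the search, that rules out 7 for every possible sequence of moves.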
Class Invariant
A class invariant is a condition that should always be true before and after calling any relevant function.
For example, a balanced tree has an invariant called isBalanced. When you modify the tree through some method (e.g. addNode, removeNode...), isBalanced should be true both before and after the modification.
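A balanced tree is too long to show here, so the sketch below swaps in a much simpler structure to show the same pattern: a hypothetical `SortedList` class whose invariant is `isSorted()`, asserted on entry to and exit from every modifying method. The class and its method names are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class whose invariant is "items_ is sorted in ascending order".
class SortedList {
public:
    void add(int value) {
        assert(isSorted());  // invariant holds on entry
        items_.insert(std::upper_bound(items_.begin(), items_.end(), value),
                      value);
        assert(isSorted());  // and again on exit
    }

    // Removes one occurrence of value; returns false if it was not present.
    bool remove(int value) {
        assert(isSorted());
        auto it = std::lower_bound(items_.begin(), items_.end(), value);
        bool found = (it != items_.end() && *it == value);
        if (found) items_.erase(it);
        assert(isSorted());
        return found;
    }

    bool isSorted() const {
        return std::is_sorted(items_.begin(), items_.end());
    }
    std::size_t size() const { return items_.size(); }
    int front() const { return items_.front(); }  // requires a non-empty list

private:
    std::vector<int> items_;
};
```

Because `items_` is private, only `add` and `remove` can change it, so checking the invariant in those two methods is enough to cover every modification.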
Following on from what it is, invariants are quite useful in writing clean code, since knowing conceptually what invariants should be present in your code allows you to easily decide how to organize your code to reach those aims. As mentioned earlier, they're also useful in debugging, as checking whether the invariant is being maintained is often a good way of seeing if whatever manipulation you're attempting to perform is actually doing what you want it to.
It's typically a quantity that does not change under certain mathematical operations.
An example is a scalar, which does not change under rotations. In magnetic resonance imaging, for example, it is useful to characterize a tissue property by a rotational invariant, because then its estimation ideally does not depend on the orientation of the body in the scanner.
The ADT invariant specifies relationships among the data fields (instance variables) that must always be true before and after the execution of any instance method.
There is an excellent example of an invariant and why it matters in the book Java Concurrency in Practice.
The example is Java-centric, but the idea is general: it describes code that is responsible for calculating the factors of a provided integer. To improve performance, the code attempts to cache the last number provided and the factors that were calculated for it. In this scenario there is an invariant that the example code does not account for, which leaves it susceptible to race conditions when used concurrently.
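The following is not the book's code but a rough C++ sketch of the same idea, assuming the invariant in question is "the cached factors always belong to the cached number". The class name `CachedFactorizer`, its members and the trial-division helper are all hypothetical. The two cached fields are only ever read or written together while holding one mutex, so no thread can see a number paired with another number's factors.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical cache. Invariant: lastFactors_ holds the prime factors of
// lastNumber_ whenever hasCache_ is true.
class CachedFactorizer {
public:
    std::vector<int> factors(int n) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (hasCache_ && n == lastNumber_) return lastFactors_;  // hit
        }
        std::vector<int> result = compute(n);  // slow work, outside the lock
        {
            std::lock_guard<std::mutex> lock(mutex_);
            // Both fields are updated under the same lock, so the
            // invariant holds whenever the lock is released.
            lastNumber_ = n;
            lastFactors_ = result;
            hasCache_ = true;
        }
        return result;
    }

private:
    // Prime factorization by trial division (good enough for a sketch).
    static std::vector<int> compute(int n) {
        std::vector<int> out;
        for (int d = 2; n > 1 && d <= n / d; ++d) {
            while (n % d == 0) {
                out.push_back(d);
                n /= d;
            }
        }
        if (n > 1) out.push_back(n);
        return out;
    }

    std::mutex mutex_;
    bool hasCache_ = false;
    int lastNumber_ = 0;
    std::vector<int> lastFactors_;
};
```

If the number and its factors were instead updated as two separate unsynchronized writes, another thread could read the new number together with the old factors, which is the kind of race described above.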