Creating a logic gate simulator - boolean-logic

I need to make an application for creating logic circuits and seeing the results. This is primarily for use in A-Level (UK, 16-18 year olds generally) computing courses.
Ive never made any applications like this, so am not sure on the best design for storing the circuit and evaluating the results (at a resomable speed, say 100Hz on a 1.6Ghz single core computer).
Rather than have the circuit built from the basic gates (and, or, nand, etc) I want to allow these gates to be used to make "chips" which can then be used within other circuits (eg you might want to make a 8bit register chip, or a 16bit adder).
The problem is that the number of gates increases massively with such circuits, such that if the simulation worked on each individual gate it would have 1000's of gates to simulate, so I need to simplify these components that can be placed in a circuit so they can be simulated quickly.
I thought about generating a truth table for each component, then simulation could use a lookup table to find the outputs for a given input. The problem occurred to me though that the size of such tables increase massively with inputs. If a chip had 32 inputs, then the truth table needs 2^32 rows. This uses a massive amount of memory in many cases more than there is to use so isn't practical for non-trivial components, it also wont work with chips that can store their state (eg registers) since they cant be represented as a simply table of inputs and outputs.
I know I could just hardcode things like register chips, however since this is for educational purposes I want it so that people can make their own components as well as view and edit the implementations for standard ones. I considered allowing such components to be created and edited using code (eg dlls or a scripting language), so that an adder for example could be represented as "output = inputA + inputB" however that assumes that the students have done enough programming in the given language to be able to understand and write such plugins to mimic the results of their circuit which is likly to not be the case...
Is there some other way to take a boolean logic circuit and simplify it automatically so that the simulation can determine the outputs of a component quickly?
As for storing the components I was thinking of storing some kind of tree structure, such that each component is evaluated once all components that link to its inputs are evaluated.
eg consider: A.B + C
The simulator would first evaluate the AND gate, and then evaluate the OR gate using the output of the AND gate and C.
However it just occurred to me that in cases where the outputs link back round to the inputs, will cause a deadlock because there inputs will never all be evaluated...How can I overcome this, since the program can only evaluate one gate at a time?

Have you looked at Richard Bowles's simulator?

You're not the first person to want to build their own circuit simulator ;-).
My suggestion is to settle on a minimal set of primitives. When I began mine (which I plan to resume one of these days...) I had two primitives:
Source: zero inputs, one output that's always 1.
Transistor: two inputs A and B, one output that's A and not B.
Obviously I'm misusing the terminology a bit, not to mention neglecting the niceties of electronics. On the second point I recommend abstracting to wires that carry 1s and 0s like I did. I had a lot of fun drawing diagrams of gates and adders from these. When you can assemble them into circuits and draw a box round the set (with inputs and outputs) you can start building bigger things like multipliers.
If you want anything with loops you need to incorporate some kind of delay -- so each component needs to store the state of its outputs. On every cycle you update all the new states from the current states of the upstream components.
Edit Regarding your concerns on scalability, how about defaulting to the first principles method of simulating each component in terms of its state and upstream neighbours, but provide ways of optimising subcircuits:
If you have a subcircuit S with inputs A[m] with m < 8 (say, giving a maximum of 256 rows) and outputs B[n] and no loops, generate the truth table for S and use that. This could be done automatically for identified subcircuits (and reused if the subcircuit appears more than once) or by choice.
If you have a subcircuit with loops, you may still be able to generate a truth table. There are fixed-point finding methods which can help here.
If your subcircuit has delays (and they are significant to the enclosing circuit) the truth table can incorporate state columns. E.g. if the subcircuit has input A, inner state B, and output C, where C <- A and B, B <- A, the truth table could be:
A B | B C
0 0 | 0 0
0 1 | 0 0
1 0 | 1 0
1 1 | 1 1
If you have a subcircuit that the user asserts implements a particular known pattern such as "adder", provide an option for using a hard-coded implementation for updating that subcircuit instead of by simulating its inner parts.

When I made a circuit emulator (sadly, also incomplete and also unreleased), here's how I handled loops:
Each circuit element stores its boolean value
When an element "E0" changes its value, it notifies (via the observer pattern) all who depend on it
Each observing element evaluates its new value and does likewise
When the E0 change occurs, a level-1 list is kept of all elements affected. If an element already appears on this list, it gets remembered in a new level-2 list but doesn't continue to notify its observers. When the sequence which E0 began has stopped notifying new elements, the next queue level is handled. Ie: the sequence is followed and completed for the first element added to level-2, then the next added to level-2, etc. until all of level-x is complete, then you move to level-(x+1)
This is in no way complete. If you ever have multiple oscillators doing infinite loops, then no matter what order you take them in, one could prevent the other from ever getting its turn. My next goal was to alleviate this by limiting steps with clock-based sync'ing instead of cascading combinatorials, but I never got this far in my project.

You might want to take a look at the From Nand To Tetris in 12 steps course software. There is a video talking about it on youtube.
The course page is at: http://www1.idc.ac.il/tecs/

If you can disallow loops (outputs linking back to inputs), then you can significantly simplify the problem. In that case, for every input there will be exactly one definite output. Cycles however can make the output undecideable (or rather, constantly changing).
Evaluating a circuit without loops should be easy - just use the BFS algorithm with "junctions" (connections between logic gates) as the items in the list. Start off with all the inputs to all the gates in an "undefined" state. As soon as a gate has all inputs "defined" (either 1 or 0), calculate its output and add its output junctions to the BFS list. This way you only have to evaluate each gate and each junction once.
If there are loops, the same algorithm can be used, but the circuit can be built in such a way that it never comes to a "rest" and some junctions are always changing between 1 and 0.
OOps, actually, this algorithm can't be used in this case because the looped gates (and gates depending on them) would forever stay as "undefined".

You could introduce them to the concept of Karnaugh maps, which would help them simplify truth values for themselves.

You could hard code all the common ones. Then allow them to build their own out of the hard coded ones (which would include low level gates), which would be evaluated by evaluating each sub-component. Finally, if one of their "chips" has less than X inputs/outputs, you could "optimize" it into a lookup table. Maybe detect how common it is and only do this for the most used Y chips? This way you have a good speed/space tradeoff.
You could always JIT compile the circuits...
As I haven't really thought about it, I'm not really sure what approach I'd take.. but it would possibly be a hybrid method and I'd definitely hard code popular "chips" in too.

When I was playing around making a "digital circuit" simulation environment, I had each defined circuit (a basic gate, a mux, a demux and a couple of other primitives) associated with a transfer function (that is, a function that computes all outputs, based on the present inputs), an "agenda" structure (basically a linked list of "when to activate a specific transfer function), virtual wires and a global clock.
I arbitrarily set the wires to hard-modify the inputs whenever the output changed and the act of changing an input on any circuit to schedule a transfer function to be called after the gate delay. With this at hand, I could accommodate both clocked and unclocked circuit elements (a clocked element is set to have its transfer function run at "next clock transition, plus gate delay", any unclocked element just depends on the gate delay).
Never really got around to build a GUI for it, so I've never released the code.

Related

LSTM Evolution Forecast

I have a confusion about the way the LSTM networks work when forecasting with an horizon that is not finite but I'm rather searching for a prediction in whatever time in future. In physical terms I would call it the evolution of the system.
Suppose I have a time series $y(t)$ (output) I want to forecast, and some external inputs $u_1(t), u_2(t),\cdots u_N(t)$ on which the series $y(t)$ depends.
It's common to use the lagged value of the output $y(t)$ as input for the network, such that I schematically have something like (let's consider for simplicity just lag 1 for the output and no lag for the external input):
[y(t-1), u_1(t), u_2(t),\cdots u_N(t)] \to y(t)
In this way of thinking the network, when one wants to do recursive forecast it is forced to use the predicted value at the previous step as input for the next step. In this way we have an effect of propagation of error that makes the long term forecast badly behaving.
Now, my confusion is, I'm thinking as a RNN as a kind of an (simple version) implementation of a state space model where I have the inputs, my output and one or more state variable responsible for the memory of the system. These variables are hidden and not observed.
So now the question, if there is this kind of variable taking already into account previous states of the system why would I need to use the lagged output value as input of my network/model ?
Getting rid of this does my long term forecast would be better, since I'm not expecting anymore the propagation of the error of the forecasted output. (I guess there will be anyway an error in the internal state propagating)
Thanks !
Please see DeepAR - a LSTM forecaster more than one step into the future.
The main contributions of the paper are twofold: (1) we propose an RNN
architecture for probabilistic forecasting, incorporating a negative
Binomial likelihood for count data as well as special treatment for
the case when the magnitudes of the time series vary widely; (2) we
demonstrate empirically on several real-world data sets that this
model produces accurate probabilistic forecasts across a range of
input characteristics, thus showing that modern deep learning-based
approaches can effective address the probabilistic forecasting
problem, which is in contrast to common belief in the field and the
mixed results
In this paper, they forecast multiple steps into the future, to negate exactly what you state here which is the error propagation.
Skipping several steps allows to get more accurate predictions, further into the future.
One more thing done in this paper is predicting percentiles, and interpolating, rather than predicting the value directly. This adds stability, and an error assessment.
Disclaimer - I read an older version of this paper.

Indirect Kalman Filter for Inertial Navigation System

I'm trying to implement an Inertial Navigation System using an Indirect Kalman Filter. I've found many publications and thesis on this topic, but not too much code as example. For my implementation I'm using the Master Thesis available at the following link:
https://fenix.tecnico.ulisboa.pt/downloadFile/395137332405/dissertacao.pdf
As reported at page 47, the measured values from inertial sensors equal the true values plus a series of other terms (bias, scale factors, ...).
For my question, let's consider only bias.
So:
Wmeas = Wtrue + BiasW (Gyro meas)
Ameas = Atrue + BiasA. (Accelerometer meas)
Therefore,
when I propagate the Mechanization equations (equations 3-29, 3-37 and 3-41)
I should use the "true" values, or better:
Wmeas - BiasW
Ameas - BiasA
where BiasW and BiasA are the last available estimation of the bias. Right?
Concerning the update phase of the EKF,
if the measurement equation is
dzV = VelGPS_est - VelGPS_meas
the H matrix should have an identity matrix in corrispondence of the velocity error state variables dx(VEL) and 0 elsewhere. Right?
Said that I'm not sure how I have to propagate the state variable after update phase.
The propagation of the state variable should be (in my opinion):
POSk|k = POSk|k-1 + dx(POS);
VELk|k = VELk|k-1 + dx(VEL);
...
But this didn't work. Therefore I've tried:
POSk|k = POSk|k-1 - dx(POS);
VELk|k = VELk|k-1 - dx(VEL);
that didn't work too... I tried both solutions, even if in my opinion the "+" should be used. But since both don't work (I have some other error elsewhere)
I would ask you if you have any suggestions.
You can see a snippet of code at the following link: http://pastebin.com/aGhKh2ck.
Thanks.
The difficulty you're running into is the difference between the theory and the practice. Taking your code from the snippet instead of the symbolic version in the question:
% Apply corrections
Pned = Pned + dx(1:3);
Vned = Vned + dx(4:6);
In theory when you use the Indirect form you are freely integrating the IMU (that process called the Mechanization in that paper) and occasionally running the IKF to update its correction. In theory the unchecked double integration of the accelerometer produces large (or for cheap MEMS IMUs, enormous) error values in Pned and Vned. That, in turn, causes the IKF to produce correspondingly large values of dx(1:6) as time evolves and the unchecked IMU integration runs farther and farther away from the truth. In theory you then sample your position at any time as Pned +/- dx(1:3) (the sign isn't important -- you can set that up either way). The important part here is that you are not modifying Pned from the IKF because both are running independent from each other and you add them together when you need the answer.
In practice you do not want to take the difference between two enourmous double values because you will lose precision (because many of the bits of the significand were needed to represent the enormous part instead of the precision you want). You have grasped that in practice you want to recursively update Pned on each update. However, when you diverge from the theory this way, you have to take the corresponding (and somewhat unobvious) step of zeroing out your correction value from the IKF state vector. In other words, after you do Pned = Pned + dx(1:3) you have "used" the correction, and you need to balance the equation with dx(1:3) = dx(1:3) - dx(1:3) (simplified: dx(1:3) = 0) so that you don't inadvertently integrate the correction over time.
Why does this work? Why doesn't it mess up the rest of the filter? As it turns out, the KF process covariance P does not actually depend on the state x. It depends on the update function and the process noise Q and so on. So the filter doesn't care what the data is. (Now that's a simplification, because often Q and R include rotation terms, and R might vary based on other state variables, etc, but in those cases you are actually using state from outside the filter (the cumulative position and orientation) not the raw correction values, which have no meaning by themselves).

What is an Abstract Syntax Tree/Is it needed?

I've been interested in compiler/interpreter design/implementation for as long as I've been programming (only 5 years now) and it's always seemed like the "magic" behind the scenes that nobody really talks about (I know of at least 2 forums for operating system development, but I don't know of any community for compiler/interpreter/language development). Anyways, recently I've decided to start working on my own, in hopes to expand my knowledge of programming as a whole (and hey, it's pretty fun :). So, based off the limited amount of reading material I have, and Wikipedia, I've developed this concept of the components for a compiler/interpreter:
Source code -> Lexical Analysis -> Abstract Syntax Tree -> Syntactic Analysis -> Semantic Analysis -> Code Generation -> Executable Code.
(I know there's more to code generation and executable code, but I haven't gotten that far yet :)
And with that knowledge, I've created a very basic lexer (in Java) to take input from a source file, and output the tokens into another file. A sample input/output would look like this:
Input:
int a := 2
if(a = 3) then
print "Yay!"
endif
Output (from lexer):
INTEGER
A
ASSIGN
2
IF
L_PAR
A
COMP
3
R_PAR
THEN
PRINT
YAY!
ENDIF
Personally, I think it would be really easy to go from there to syntactic/semantic analysis, and possibly even code generation, which leads me to question: Why use an AST, when it seems that my lexer is doing just as good a job? However, 100% of my sources I use to research this topic all seem adamant that this is a necessary part of any compiler/interpreter. Am I missing the point of what an AST really is (a tree that shows the logical flow of a program)?
TL;DR: Currently in route to develop a compiler, finished the lexer, seems to me like the output would make for easy syntactic analysis/semantic analysis, rather than doing an AST. So why use one? Am I missing the point of one?
Thanks!
First off, one thing about your list of components does not make sense. Building an AST is (pretty much) the syntactic analysis, so it either shouldn't be in there, or at least come before the AST.
What you got there is a lexer. All it gives you are individual tokens. In any case, you will need an actual parser, because regular languages aren't any fun to program in. You can't even (properly) nest expressions. Heck, you can't even handle operator precedence. A token stream doesn't give you:
An idea where statements and expressions start and end.
An idea how statements are grouped into blocks.
An idea Which part of the expression has which precedence, associativity, etc.
A clear, uncluttered view at the actual structure of the program.
A structure which can be passed through a myriad of transformations, without every single pass knowing and having code to accomodate that the condition in an if is enclosed by parentheses.
... more generally, any kind of comprehension above the level of a single token.
Suppose you have two passes in your compiler which optimize certain kinds of operators applies to certain arguments (say, constant folding and algebraic simplifications like x - x -> 0). If you hand them tokens for the expression x - x * 1, these passes are cluttered with figuring out that the x * 1 part comes first. And they have to know that, lest the transformation is incorrect (consider 1 + 2 * 3).
These things are tricky enough to get right as it is, so you don't want to be pestered by parsing problems as well. That's why you solve the parsing problem first, in a separate parsing step. Then you can, say, replace a function call with its definition, without worrying about adding parenthesis so the meaning remains the same. You save time, you separate concerns, you avoid repetition, you enable simpler code in many other places, etc.
A parser figures all that out, and builds an AST which consequently holds all that information. Without any further data on the nodes, the shape of the AST alone gives you no. 1, 2, 3, and much more, for free. None of the bazillion passes that follow have to worry about it anymore.
That's not to say you always have to have an AST. For sufficiently simple languages, you can do a single-pass compiler. Instead of generating an AST or some other intermediate representation during parsing, you emit code as you go. However, this becomes harder for less simple languages and you can't reasonably do a lot of stuff (such as 70% of all optimizations and diagnostics -- and yes I just made that number up). Generally, I wouldn't advise you to do this. There are good reasons single-pass compilers are mostly dead. Even languages which permit them (e.g. C) are nowadays implemented with multiple passes and ASTs. It's a simple way to get started, but will severely limit you (and the language, if you design it) later.
You've got the AST at the wrong point in your flow diagram. Typically, the output of the lexer is a series of tokens (as you have in your output), and these are fed to the parser/syntactic analyzer, which generates the AST. So the output of your lexer is different from an AST because they are used at different points in the compilation process and fulfill different purposes.
The next logical question is: What, then, is an AST? Well, the purpose of parsing/syntactic analysis is to turn the series of tokens generated by the lexer into an AST (or parse tree). The AST is an intermediate representation that captures the relationship between syntactical elements in a way that is easier to work with programmatically. One way of thinking about this is that a text program is a one dimensional construct, and can only represent ideas as a sequence of elements, while the AST is freed from this constraint, and can represent the underlying relationships between those elements in 2 dimensions (as typically drawn), or any higher dimension space if you so choose to think about it that way.
For instance, a binary operator has two operands, let's call them A and B. In code, this may be spelled 'A * B' (assuming an infix operator - another advantage of an AST is to hide such distinctions that may be important syntactically, but not semantically), but for the compiler to "understand" this expression, it must read 5 characters sequentially, and this logic can quickly become cumbersome, given the many possibilities in even a small language. In an AST representation, however, we have a "binary operator" node whose value is '*', and that node has two children, values 'A' and 'B'.
As your compiler project progresses, I think you will begin to see the advantages of this representation.

Purity vs Referential transparency

The terms do appear to be defined differently, but I've always thought of one implying the other; I can't think of any case when an expression is referentially transparent but not pure, or vice-versa.
Wikipedia maintains separate articles for these concepts and says:
From Referential transparency:
If all functions involved in the
expression are pure functions, then
the expression is referentially
transparent. Also, some impure
functions can be included in the
expression if their values are
discarded and their side effects are
insignificant.
From Pure expressions:
Pure functions are required to
construct pure expressions. [...] Pure
expressions are often referred to as
being referentially transparent.
I find these statements confusing. If the side effects from a so-called "impure function" are insignificant enough to allow not performing them (i.e. replace a call to such a function with its value) without materially changing the program, it's the same as if it were pure in the first place, isn't it?
Is there a simpler way to understand the differences between a pure expression and a referentially transparent one, if any? If there is a difference, an example expression that clearly demonstrates it would be appreciated.
If I gather in one place any three theorists of my acquaintance, at least two of them disagree on the meaning of the term "referential transparency." And when I was a young student, a mentor of mine gave me a paper explaining that even if you consider only the professional literature, the phrase "referentially transparent" is used to mean at least three different things. (Unfortunately that paper is somewhere in a box of reprints that have yet to be scanned. I searched Google Scholar for it but I had no success.)
I cannot inform you, but I can advise you to give up: Because even the tiny cadre of pointy-headed language theorists can't agree on what it means, the term "referentially transparent" is not useful. So don't use it.
P.S. On any topic to do with the semantics of programming languages, Wikipedia is unreliable. I have given up trying to fix it; the Wikipedian process seems to regard change and popular voting over stability and accuracy.
All pure functions are necessarily referentially transparent. Since, by definition, they cannot access anything other than what they are passed, their result must be fully determined by their arguments.
However, it is possible to have referentially transparent functions which are not pure. I can write a function which is given an int i, then generates a random number r, subtracts r from itself and places it in s, then returns i - s. Clearly this function is impure, because it is generating random numbers. However, it is referentially transparent. In this case, the example is silly and contrived. However, in, e.g., Haskell, the id function is of type a - > a whereas my stupidId function would be of type a -> IO a indicating that it makes use of side effects. When a programmer can guarantee through means of an external proof that their function is actually referentially transparent, then they can use unsafePerformIO to strip the IO back away from the type.
I'm somewhat unsure of the answer I give here, but surely somebody will point us in some direction. :-)
"Purity" is generally considered to mean "lack of side-effects". An expression is said to be pure if its evaluation lacks side-effects. What's a side-effect then? In a purely functional language, side-effect is anything that doesn't go by the simple beta-rule (the rule that to evaluate function application is the same as to substitute actual parameter for all free occurrences of the formal parameter).
For example, in a functional language with linear (or uniqueness, this distinction shouldn't bother at this moment) types some (controlled) mutation is allowed.
So I guess we have sorted out what "purity" and "side-effects" might be.
Referential transparency (according to the Wikipedia article you cited) means that variable can be replaced by the expression it denotes (abbreviates, stands for) without changing the meaning of the program at hand (btw, this is also a hard question to tackle, and I won't attempt to do so here). So, "purity" and "referential transparency" are indeed different things: "purity" is a property of some expression roughly means "doesn't produce side-effects when executed" whereas "referential transparency" is a property relating variable and expression that it stands for and means "variable can be replaced with what it denotes".
Hopefully this helps.
These slides from one ACCU2015 talk have a great summary on the topic of referential transparency.
From one of the slides:
A language is referentially transparent if (a)
every subexpression can be replaced by any other
that’s equal to it in value and (b) all occurrences of
an expression within a given context yield the
same value.
You can have, for instance, a function that logs its computation to the program standard output (so, it won't be a pure function), but you can replace calls for this function by a similar function that doesn't log its computation. Therefore, this function have the referential transparency property. But... the above definition is about languages, not expressions, as the slides emphasize.
[...] it's the same as if it were pure in the first place, isn't it?
From the definitions we have, no, it is not.
Is there a simpler way to understand the differences between a pure expression and a referentially transparent one, if any?
Try the slides I mentioned above.
The nice thing about standards is that there are so many of them to choose
from.
Andrew S. Tanenbaum.
...along with definitions of referential transparency:
from page 176 of Functional programming with Miranda by Ian Holyer:
8.1 Values and Behaviours
The most important property of the semantics of a pure functional language is that the declarative and operational views of the language coincide exactly, in the following way:
Every expression denotes a value, and there are valuescorresponding to all possible program behaviours. Thebehaviour produced by an expression in any context is completely determined by its value, and vice versa.
This principle, which is usually rather opaquely called referential transparency, can also be pictured in the following way:
and from Nondeterminism with Referential Transparency in Functional Programming Languages by F. Warren Burton:
[...] the property that an expression always has the same value in the same environment [...]
...for various other definitions, see Referential Transparency, Definiteness and Unfoldability by Harald Søndergaard and Peter Sestoft.
Instead, we'll begin with the concept of "purity". For the three of you who didn't know it already, the computer or device you're reading this on is a solid-state Turing machine, a model of computing intrinsically connected with effects. So every program, functional or otherwise, needs to use those effects To Get Things DoneTM.
What does this mean for purity? At the assembly-language level, which is the domain of the CPU, all programs are impure. If you're writing a program in assembly language, you're the one who is micro-managing the interplay between all those effects - and it's really tedious!
Most of the time, you're just instructing the CPU to move data around in the computer's memory, which only changes the contents of individual memory locations - nothing to see there! It's only when your instructions direct the CPU to e.g. write to video memory, that you observe a visible change (text appearing on the screen).
For our purposes here, we'll split effects into two coarse categories:
those involving I/O devices like screens, speakers, printers, VR-headsets, keyboards, mice, etc; commonly known as observable effects.
and the rest, which only ever change the contents of memory.
In this situation, purity just means the absence of those observable effects, the ones which cause a visible change to the environment of the running program, maybe even its host computer. It is definitely not the absence of all effects, otherwise we would have to replace our solid-state Turing machines!
Now for the question of 42 life, the Universe and everything what exactly is meant by the term "referential transparency" - instead of herding cats trying to bring theorists into agreement, let's just try to find the original meaning given to the term. Fortunately for us, the term frequently appears in the context of I/O in Haskell - we only need a relevant article...here's one: from the first page of Owen Stephen's Approaches to Functional I/O:
Referential transparency refers to the ability to replace a sub-expression with one of equal value, without changing the value of the outer expression. Originating from Quine the term was introduced to Computer Science by Strachey.
Following the references:
From page 9 of 39 in Christopher Strachey's Fundamental Concepts in Programming Languages:
One of the most useful properties of expressions is that called by Quine referential transparency. In essence this means that if we wish to find the value of an expression which contains a sub-expression, the only thing we need to know about the sub-expression is its value. Any other features of the sub-expression, such as its internal structure, the number and nature of its components, the order in which they are evaluated or the colour of the ink in which they are written, are irrelevant to the value of the main expression.
From page 163 of 314 in Willard Van Ormond Quine's Word and Object:
[...] Quotation, which thus interrupts the referential force of a term, may be said to fail of referential transparency2. [...] I call a mode of confinement Φ referentially transparent if, whenever an occurrence of a singular term t is purely referential in a term or sentence ψ(t), it is purely referential also in the containing term or sentence Φ(ψ(t)).
with the footnote:
2 The term is from Whitehead and Russell, 2d ed., vol. 1, p. 665.
Following that reference:
From page 709 of 719 in Principa Mathematica by Alfred North Whitehead and Bertrand Russell:
When an assertion occurs, it is made by means of a particular fact, which is an instance of the proposition asserted. But this particular fact is, so to speak, "transparent"; nothing is said about it, bit by means of it something is said about something else. It is the "transparent" quality which belongs to propositions as they occur in truth-functions.
Let's try to bring all that together:
Whitehead and Russell introduce the term "transparent";
Quine then defines the qualified term "referential transparency";
Strachey then adapts Quine's definition in defining the basics of programming languages.
So it's a choice between Quine's original or Strachey's adapted definition. You can try translating Quine's definition for yourself if you like - everyone who's ever contested the definition of "purely functional" might even enjoy the chance to debate something different like what "mode of containment" and "purely referential" really means...have fun! The rest of us will just accept that Strachey's definition is a little vague ("In essence [...]") and continue on:
One useful property of expressions is referential transparency. In essence this means that if we wish to find the value of an expression which contains a sub-expression,
the only thing we need to know about the sub-expression is its value. Any other features of the sub-expression, such as its internal structure, the number and nature of
its components, the order in which they are evaluated or the colour of the ink in which they are written, are irrelevant to the value of the main expression.
(emphasis by me.)
Regarding that description ("that if we wish to find the value of [...]"), a similar, but more concise statement is given by Peter Landin in The Next 700 Programming Languages:
the thing an expression denotes, i.e., its "value", depends only on the values of its sub-expressions, not on other properties of them.
Thus:
One useful property of expressions is referential transparency. In essence this means the thing an expression denotes, i.e., its "value", depends only on the values of its sub-expressions, not on other properties of them.
Strachey provides some examples:
(page 12 of 39)
We tend to assume automatically that the symbol x in an expression such as 3x2 + 2x + 17 stands for the same thing (or has the same value) on each occasion it occurs. This is the most important consequence of referential transparency and it is only in virtue of this property that we can use the where-clauses or λ-expressions described in the last section.
(and on page 16)
When the function is used (or called or applied) we write f[ε] where ε can be an expression. If we are using a referentially transparent language all we require to know about the expression ε in order to evaluate f[ε] is its value.
So referential transparency, by Strachey's original definition, implies purity - in the absence of an order of evaluation, observable and other effects are practically useless...
I'll quote John Mitchell Concept in programming language. He defines pure functional language has to pass declarative language test which is free from side-effects or lack of side effects.
"Within the scope of specific deceleration of x1,...,xn , all occurrence of an expression e containing only variables x1,...,xn have the same value."
In linguistics a name or noun phrase is considered referentially transparent if it may be replaced with the another noun phrase with same referent without changing the meaning of the sentence it contains.
Which in 1st case holds but in 2nd case it gets too weird.
Case 1:
"I saw Walter get into his new car."
And if Walter own a Centro then we could replace that in the given sentence as:
"I saw Walter get into his Centro"
Contrary to first :
Case #2 : He was called William Rufus because of his read beard.
Rufus means somewhat red and reference was to William IV of England.
"He was called William IV because of his read beard." looks too awkward.
Traditional way to say is, a language is referentially transparent if we may replace one expression with another of equal value anywhere in the program without changing the meaning of the program.
So, referential transparency is a property of pure functional language.
And if your program is free from side effects then this property will hold.
So give it up is awesome advice but get it on might also look good in this context.
Pure functions are those that return the same value on every call, and do not have side effects.
Referential transparency means that you can replace a bound variable with its value and still receive the same output.
Both pure and referentially transparent:
def f1(x):
t1 = 3 * x
t2 = 6
return t1 + t2
Why is this pure?
Because it is a function of only the input x and has no side-effects.
Why is this referentially transparent?
You could replace t1 and t2 in f1 with their respective right hand sides in the return statement, as follows
def f2(x):
return 3 * x + 6
and f2 will still always return the same result as f1 in every case.
Pure, but not referentially transparent:
Let's modify f1 as follows:
def f3(x):
t1 = 3 * x
t2 = 6
x = 10
return t1 + t2
Let us try the same trick again by replacing t1 and t2 with their right hand sides, and see if it is an equivalent definition of f3.
def f4(x):
x = 10
return 3 * x + 6
We can easily observe that f3 and f4 are not equivalent on replacing variables with their right hand sides / values. f3(1) would return 9 and f4(1) would return 36.
Referentially transparent, but not pure:
Simply modifying f1 to receive a non-local value of x, as follows:
def f5:
global x
t1 = 3 * x
t2 = 6
return t1 + t2
Performing the same replacement exercise from before shows that f5 is still referentially transparent. However, it is not pure because it is not a function of only the arguments passed to it.
Observing carefully, the reason we lose referential transparency moving from f3 to f4 is that x is modified. In the general case, making a variable final (or those familiar with Scala, using vals instead of vars) and using immutable objects can help keep a function referentially transparent. This makes them more like variables in the algebraic or mathematical sense, thus lending themselves better to formal verification.

What is an invariant?

The word seems to get used in a number of contexts. The best I can figure is that they mean a variable that can't change. Isn't that what constants/finals (darn you Java!) are for?
An invariant is more "conceptual" than a variable. In general, it's a property of the program state that is always true. A function or method that ensures that the invariant holds is said to maintain the invariant.
For instance, a binary search tree might have the invariant that for every node, the key of the node's left child is less than the node's own key. A correctly written insertion function for this tree will maintain that invariant.
As you can tell, that's not the sort of thing you can store in a variable: it's more a statement about the program. By figuring out what sort of invariants your program should maintain, then reviewing your code to make sure that it actually maintains those invariants, you can avoid logical errors in your code.
It is a condition you know to always be true at a particular place in your logic and can check for when debugging to work out what has gone wrong.
The magic of wikipedia: Invariant (computer science)
In computer science, a predicate that,
if true, will remain true throughout a
specific sequence of operations, is
called (an) invariant to that
sequence.
This answer is for my 5 year old kid. Do not think of an invariant as a constant or fixed numerical value. But it can be. However, it is more than that.
Rather, an invariant is something like of a fixed relationship between varying entities. For example, your age will always be less than that compared to your biological parents. Both your age, and your parent's age changes in the passage of time, but the relationship that i mentioned above is an invariant.
An invariant can also be a numerical constant. For example, the value of pi is an invariant ratio between the circle's circumference over its diameter. No matter how big or small the circle is, that ratio will always be pi.
I usually view them more in terms of algorithms or structures.
For example, you could have a loop invariant that could be asserted--always true at the beginning or end of each iteration. That is, if your loop was supposed to process a collection of objects from one stack to another, you could say that |stack1|+|stack2|=c, at the top or bottom of the loop.
If the invariant check failed, it would indicate something went wrong. In this example, it could mean that you forgot to push the processed element onto the final stack, etc.
As this line states:
In computer science, a predicate that, if true, will remain true throughout a specific sequence of operations, is called (an) invariant to that sequence.
To better understand this hope this example in C++ helps.
Consider a scenario where you have to get some values and get the total count of them in a variable called as count and add them in a variable called as sum
The invariant (again it's more like a concept):
// invariant:
// we have read count grades so far, and
// sum is the sum of the first count grades
The code for the above would be something like this,
int count=0;
double sum=0,x=0;
while (cin >> x) {
++count;
sum+=x;
}
What the above code does?
1) Reads the input from cin and puts them in x
2) After one successful read, increment count and sum = sum + x
3) Repeat 1-2 until read stops ( i.e ctrl+D)
Loop invariant:
The invariant must be True ALWAYS. So initially you start out your code with just this
while(cin>>x){
}
This loop reads data from standard input and stores in x. Well and good. But the invariant becomes false because the first part of our invariant wasn't followed (or kept true).
// we have read count grades so far, and
How to keep the invariant true?
Simple! increment count.
So ++count; would do good!. Now our code becomes something like this,
while(cin>>x){
++count;
}
But
Even now our invariant (a concept which must be TRUE) is False because now we didn't satisfy the second part of our invariant.
// sum is the sum of the first count grades
So what to do now?
Add x to sum and store it in sum ( sum+=x) and the next time
cin>>x will read a new value into x.
Now our code becomes something like this,
while(cin>>x){
++count;
sum+=x;
}
Let's check
Whether code matches our invariant
// invariant:
// we have read count grades so far, and
// sum is the sum of the first count grades
code:
while(cin>>x){
++count;
sum+=x;
}
Ah!. Now the loop invariant is True always and code works fine.
The above example was taken and modified from the book Accelerated C++ by Andrew-koening and Barbara-E
Something that doesn't change within a block of code
All the answers here are great, but i felt that i can shed more light on the matter:
Invariant from a language point of view means something that never changes. The concept though comes actually from math, it's one of the popular proof techniques when combined with induction.
Here is how a proof goes, If you can find an invariant that is in the initial state, And that this invariant persists regardless of any [legal] transformation applied to the state, then you can prove that If a certain state does not have this invariant then it can never occur, no matter what sequence of transformations are applied to the initial state.
Now the previous way of thinking (again combined with induction) makes it possible to predicate the logic of computer software. Especially important when the execution goes in loops, in which an invariant can be used to prove that a certain loop will yield a certain result or that it will never change the state of a program in a certain way.
When invariant is used to predicate a loop logic its called loop invariant. It can be used outside loops, but for loops it is really important, because you often have a lot of possibilities, or an infinite number of possibilities.
Notice that i use the word "predicate" the logic of a computer software, and not prove. And that's because while in math invariant can be used as a proof, it can never prove that the computer software when executed will yield what is expected, due to the fact that the software is executed on top of many abstractions, that can never be proved that they will yield what is expected (think of the hardware abstraction for example).
Finally while theoretically and rigorously predicting software logic is only important for high critical applications like Medical, and Military ones. Invariant can still be used to aid the typical programmer when debugging. It can be used to know where at a certain location The program failed because it has failed to maintain a certain invariant - many of us use it anyway without giving a thought about it.
Class Invariant
Class Invariant is a condition which should be always true before and after calling relevant function
For example balanced tree has an Invariant which is called isBalanced. When you modify your tree through some methods (e.g. addNode, removeNode...) - isBalanced should be always true before and after modifying the tree
Following on from what it is, invariants are quite useful in writing clean code, since knowing conceptually what invariants should be present in your code allows you to easily decide how to organize your code to reach those aims. As mentioned ealier, they're also useful in debugging, as checking to see if the invariant's being maintained is often a good way of seeing if whatever manipulation you're attempting to perform is actually doing what you want it to.
It's typically a quantity that does not change under certain mathematical operations.
An example is a scalar, which does not change under rotations. In magnetic resonance imaging, for example, it is useful to characterize a tissue property by a rotational invariant, because then its estimation ideally does not depend on the orientation of the body in the scanner.
The ADT invariant specifes relationships
among the data fields (instance variables)
that must always be true before and after
the execution of any instance method.
There is an excellent example of an invariant and why it matters in the book Java Concurrency in Practice.
Although Java-centric, the example describes some code that is responsible for calculating the factors of a provided integer. The example code attempts to cache the last number provided, and the factors that were calculated to improve performance. In this scenario there is an invariant that was not accounted for in the example code which has left the code susceptible to race conditions in a concurrent scenario.