Guidelines for listing the order of function arguments - language-agnostic

Are there any rules that you follow to determine the order of function arguments? For example, float pow(float x, float exponent) vs float pow(float exponent, float x). For concreteness, C++ could be used, but the question is valid for all programming languages.
My main concern is from the usability point of view, not runtime performance.
Edit:
Some possible bases for ordering could be:
Inputs versus Output
The way a "formula" is usually written, i.e., arguments from left-to-write.
Specificity to the argument to the context of the function, i.e., whether it is a "general" argument, e.g., a singleton object of the system, or specific.

In the example you cite, I think the order was decided on the basis of the mathematical notation xexponent, in which the base is written before the exponent and becomes the left parameter.
I'm not aware of any really sound general principle other than to try to imagine what your users will expect and/or easily remember. People aren't even wholly agreed whether you should write (source, destination) or (destination, source) when copying (compare std::copy with std::memcpy), although I'm pretty sure that the former is now much more common.
There are a whole lot of general conventions, though, followed to different extents by different people:
if the function is considered primarly to act upon a particular object, put it first
parameters that are considered to "configure" the operation of the function come after parameters that are considered the main subject of the function.
out-params come last (but I suspect some people follow the reverse)
To some extent it doesn't really matter -- namely the extent to which your users have IDEs that tell them the parameter order as they type the function name.

Related

What is an Abstract Syntax Tree/Is it needed?

I've been interested in compiler/interpreter design/implementation for as long as I've been programming (only 5 years now) and it's always seemed like the "magic" behind the scenes that nobody really talks about (I know of at least 2 forums for operating system development, but I don't know of any community for compiler/interpreter/language development). Anyways, recently I've decided to start working on my own, in hopes to expand my knowledge of programming as a whole (and hey, it's pretty fun :). So, based off the limited amount of reading material I have, and Wikipedia, I've developed this concept of the components for a compiler/interpreter:
Source code -> Lexical Analysis -> Abstract Syntax Tree -> Syntactic Analysis -> Semantic Analysis -> Code Generation -> Executable Code.
(I know there's more to code generation and executable code, but I haven't gotten that far yet :)
And with that knowledge, I've created a very basic lexer (in Java) to take input from a source file, and output the tokens into another file. A sample input/output would look like this:
Input:
int a := 2
if(a = 3) then
print "Yay!"
endif
Output (from lexer):
INTEGER
A
ASSIGN
2
IF
L_PAR
A
COMP
3
R_PAR
THEN
PRINT
YAY!
ENDIF
Personally, I think it would be really easy to go from there to syntactic/semantic analysis, and possibly even code generation, which leads me to question: Why use an AST, when it seems that my lexer is doing just as good a job? However, 100% of my sources I use to research this topic all seem adamant that this is a necessary part of any compiler/interpreter. Am I missing the point of what an AST really is (a tree that shows the logical flow of a program)?
TL;DR: Currently in route to develop a compiler, finished the lexer, seems to me like the output would make for easy syntactic analysis/semantic analysis, rather than doing an AST. So why use one? Am I missing the point of one?
Thanks!
First off, one thing about your list of components does not make sense. Building an AST is (pretty much) the syntactic analysis, so it either shouldn't be in there, or at least come before the AST.
What you got there is a lexer. All it gives you are individual tokens. In any case, you will need an actual parser, because regular languages aren't any fun to program in. You can't even (properly) nest expressions. Heck, you can't even handle operator precedence. A token stream doesn't give you:
An idea where statements and expressions start and end.
An idea how statements are grouped into blocks.
An idea Which part of the expression has which precedence, associativity, etc.
A clear, uncluttered view at the actual structure of the program.
A structure which can be passed through a myriad of transformations, without every single pass knowing and having code to accomodate that the condition in an if is enclosed by parentheses.
... more generally, any kind of comprehension above the level of a single token.
Suppose you have two passes in your compiler which optimize certain kinds of operators applies to certain arguments (say, constant folding and algebraic simplifications like x - x -> 0). If you hand them tokens for the expression x - x * 1, these passes are cluttered with figuring out that the x * 1 part comes first. And they have to know that, lest the transformation is incorrect (consider 1 + 2 * 3).
These things are tricky enough to get right as it is, so you don't want to be pestered by parsing problems as well. That's why you solve the parsing problem first, in a separate parsing step. Then you can, say, replace a function call with its definition, without worrying about adding parenthesis so the meaning remains the same. You save time, you separate concerns, you avoid repetition, you enable simpler code in many other places, etc.
A parser figures all that out, and builds an AST which consequently holds all that information. Without any further data on the nodes, the shape of the AST alone gives you no. 1, 2, 3, and much more, for free. None of the bazillion passes that follow have to worry about it anymore.
That's not to say you always have to have an AST. For sufficiently simple languages, you can do a single-pass compiler. Instead of generating an AST or some other intermediate representation during parsing, you emit code as you go. However, this becomes harder for less simple languages and you can't reasonably do a lot of stuff (such as 70% of all optimizations and diagnostics -- and yes I just made that number up). Generally, I wouldn't advise you to do this. There are good reasons single-pass compilers are mostly dead. Even languages which permit them (e.g. C) are nowadays implemented with multiple passes and ASTs. It's a simple way to get started, but will severely limit you (and the language, if you design it) later.
You've got the AST at the wrong point in your flow diagram. Typically, the output of the lexer is a series of tokens (as you have in your output), and these are fed to the parser/syntactic analyzer, which generates the AST. So the output of your lexer is different from an AST because they are used at different points in the compilation process and fulfill different purposes.
The next logical question is: What, then, is an AST? Well, the purpose of parsing/syntactic analysis is to turn the series of tokens generated by the lexer into an AST (or parse tree). The AST is an intermediate representation that captures the relationship between syntactical elements in a way that is easier to work with programmatically. One way of thinking about this is that a text program is a one dimensional construct, and can only represent ideas as a sequence of elements, while the AST is freed from this constraint, and can represent the underlying relationships between those elements in 2 dimensions (as typically drawn), or any higher dimension space if you so choose to think about it that way.
For instance, a binary operator has two operands, let's call them A and B. In code, this may be spelled 'A * B' (assuming an infix operator - another advantage of an AST is to hide such distinctions that may be important syntactically, but not semantically), but for the compiler to "understand" this expression, it must read 5 characters sequentially, and this logic can quickly become cumbersome, given the many possibilities in even a small language. In an AST representation, however, we have a "binary operator" node whose value is '*', and that node has two children, values 'A' and 'B'.
As your compiler project progresses, I think you will begin to see the advantages of this representation.

In which languages is function abstraction not primitive

In Haskell function type (->) is given, it's not an algebraic data type constructor and one cannot re-implement it to be identical to (->).
So I wonder, what languages will allow me to write my version of (->)? How does this property called?
UPD Reformulations of the question thanks to the discussion:
Which languages don't have -> as a primitive type?
Why -> is necessary primitive?
I can't think of any languages that have arrows as a user defined type. The reason is that arrows -- types for functions -- are baked in to the type system, all the way down to the simply typed lambda calculus. That the arrow type must fundamental to the language comes directly from the fact that the way you form functions in the lambda calculus is via lambda abstraction (which, at the type level, introduces arrows).
Although Marcin aptly notes that you can program in a point free style, this doesn't change the essence of what you're doing. Having a language without arrow types as primitives goes against the most fundamental building blocks of Haskell. (The language you reference in the question.)
Having the arrow as a primitive type also shares some important ties to constructive logic: you can read the function arrow type as implication from intuition logic, and programs having that type as "proofs." (Namely, if you have something of type A -> B, you have a proof that takes some premise of type A, and produces a proof for B.)
The fact that you're perturbed by the use of having arrows baked into the language might imply that you're not fundamentally grasping why they're so tied to the design of the language, perhaps it's time to read a few chapters from Ben Pierce's "Types and Programming Languages" link.
Edit: You can always look at languages which don't have a strong notion of functions and have their semantics defined with respect to some other way -- such as forth or PostScript -- but in these languages you don't define inductive data types in the same way as in functional languages like Haskell, ML, or Coq. To put it another way, in any language in which you define constructors for datatypes, arrows arise naturally from the constructors for these types. But in languages where you don't define inductive datatypes in the typical way, you don't get arrow types as naturally because the language just doesn't work that way.
Another edit: I will stick in one more comment, since I thought of it last night. Function types (and function abstraction) forms the basis of pretty much all programming languages -- at least at some level, even if it's "under the hood." However, there are languages designed to define the semantics of other languages. While this doesn't strictly match what you're talking about, PLT Redex is one such system, and is used for specifying and debugging the semantics of programming languages. It's not super useful from a practitioners perspective (unless your goal is to design new languages, in which case it is fairly useful), but maybe that fits what you want.
Do you mean meta-circular evaluators like in SICP? Being able to write your own DSL? If you create your own "function type", you'll have to take care of "applying" it, yourself.
Just as an example, you could create your own "function" in C for instance, with a look-up table holding function pointers, and use integers as functions. You'd have to provide your own "call" function for such "functions", of course:
void call( unsigned int function, int data) {
lookup_table[function](data);
}
You'd also probably want some means of creating more complex functions from primitive ones, for instance using arrays of ints to signify sequential execution of your "primitive functions" 1, 2, 3, ... and end up inventing whole new language for yourself.
I think early assemblers had no ability to create callable "macros" and had to use GOTO.
You could use trampolining to simulate function calls. You could have only global variables store, with shallow binding perhaps. In such language "functions" would be definable, though not primitive type.
So having functions in a language is not necessary, though it is convenient.
In Common Lisp defun is nothing but a macro associating a name and a callable object (though lambda is still a built-in). In AutoLisp originally there was no special function type at all, and functions were represented directly by quoted lists of s-expressions, with first element an arguments list. You can construct your function through use of cons and list functions, from symbols, directly, in AutoLisp:
(setq a (list (cons 'x NIL) '(+ 1 x)))
(a 5)
==> 6
Some languages (like Python) support more than one primitive function type, each with its calling protocol - namely, generators support multiple re-entry and returns (even if syntactically through the use of same def keyword). You can easily imagine a language which would let you define your own calling protocol, thus creating new function types.
Edit: as an example consider dealing with multiple arguments in a function call, the choice between automatic currying or automatical optional args etc. In Common LISP say, you could easily create yourself two different call macros to directly represent the two calling protocols. Consider functions returning multiple values not through a kludge of aggregates (tuples, in Haskell), but directly into designated recepient vars/slots. All are different types of functions.
Function definition is usually primitive because (a) functions are how programmes get things done; and (b) this sort of lambda-abstraction is necessary to be able to programme in a pointful style (i.e. with explicit arguments).
Probably the closest you will come to a language that meets your criteria is one based on a purely pointfree model which allows you to create your own lambda operator. You might like to explore pointfree languages in general, and ones based on SKI calculus in particular: http://en.wikipedia.org/wiki/SKI_combinator_calculus
In such a case, you still have primitive function types, and you always will, because it is a fundamental element of the type system. If you want to get away from that at all, probably the best you could do would be some kind of type system based on a category-theoretic generalisation of functions, such that functions would be a special case of another type. See http://en.wikipedia.org/wiki/Category_theory.
Which languages don't have -> as a primitive type?
Well, if you mean a type that can be named, then there are many languages that don't have them. All languages where functions are not first class citiziens don't have -> as a type you could mention somewhere.
But, as #Kristopher eloquently and excellently explained, functions are (or can, at least, perceived as) the very basic building blocks of all computation. Hence even in Java, say, there are functions, but they are carefully hidden from you.
And, as someone mentioned assembler - one could maintain that the machine language (of most contemporary computers) is an approximation of the model of the register machine. But how it is done? With millions and billions of logical circuits, each of them being a materialization of quite primitive pure functions like NOT or NAND, arranged in a certain physical order (which is, obviously, the way hardware engeniers implement function composition).
Hence, while you may not see functions in machine code, they're still the basis.
In Martin-Löf type theory, function types are defined via indexed product types (so-called Π-types).
Basically, the type of functions from A to B can be interpreted as a (possibly infinite) record, where all the fields are of the same type B, and the field names are exactly all the elements of A. When you need to apply a function f to an argument x, you look up the field in f corresponding to x.
The wikipedia article lists some programming languages that are based on Martin-Löf type theory. I am not familiar with them, but I assume that they are a possible answer to your question.
Philip Wadler's paper Call-by-value is dual to call-by-name presents a calculus in which variable abstraction and covariable abstraction are more primitive than function abstraction. Two definitions of function types in terms of those primitives are provided: one implements call-by-value, and the other call-by-name.
Inspired by Wadler's paper, I implemented a language (Ambidexer) which provides two function type constructors that are synonyms for types constructed from the primitives. One is for call-by-value and one for call-by-name. Neither Wadler's dual calculus nor Ambidexter provides user-defined type constructors. However, these examples show that function types are not necessarily primitive, and that a language in which you can define your own (->) is conceivable.
In Scala you can mixin one of the Function traits, e.g. a Set[A] can be used as A => Boolean because it implements the Function1[A,Boolean] trait. Another example is PartialFunction[A,B], which extends usual functions by providing a "range-check" method isDefinedAt.
However, in Scala methods and functions are different, and there is no way to change how methods work. Usually you don't notice the difference, as methods are automatically lifted to functions.
So you have a lot of control how you implement and extend functions in Scala, but I think you have a real "replacement" in mind. I'm not sure this makes even sense.
Or maybe you are looking for languages with some kind of generalization of functions? Then Haskell with Arrow syntax would qualify: http://www.haskell.org/arrows/syntax.html
I suppose the dumb answer to your question is assembly code. This provides you with primitives even "lower" level than functions. You can create functions as macros that make use of register and jump primitives.
Most sane programming languages will give you a way to create functions as a baked-in language feature, because functions (or "subroutines") are the essence of good programming: code reuse.

Purity vs Referential transparency

The terms do appear to be defined differently, but I've always thought of one implying the other; I can't think of any case when an expression is referentially transparent but not pure, or vice-versa.
Wikipedia maintains separate articles for these concepts and says:
From Referential transparency:
If all functions involved in the
expression are pure functions, then
the expression is referentially
transparent. Also, some impure
functions can be included in the
expression if their values are
discarded and their side effects are
insignificant.
From Pure expressions:
Pure functions are required to
construct pure expressions. [...] Pure
expressions are often referred to as
being referentially transparent.
I find these statements confusing. If the side effects from a so-called "impure function" are insignificant enough to allow not performing them (i.e. replace a call to such a function with its value) without materially changing the program, it's the same as if it were pure in the first place, isn't it?
Is there a simpler way to understand the differences between a pure expression and a referentially transparent one, if any? If there is a difference, an example expression that clearly demonstrates it would be appreciated.
If I gather in one place any three theorists of my acquaintance, at least two of them disagree on the meaning of the term "referential transparency." And when I was a young student, a mentor of mine gave me a paper explaining that even if you consider only the professional literature, the phrase "referentially transparent" is used to mean at least three different things. (Unfortunately that paper is somewhere in a box of reprints that have yet to be scanned. I searched Google Scholar for it but I had no success.)
I cannot inform you, but I can advise you to give up: Because even the tiny cadre of pointy-headed language theorists can't agree on what it means, the term "referentially transparent" is not useful. So don't use it.
P.S. On any topic to do with the semantics of programming languages, Wikipedia is unreliable. I have given up trying to fix it; the Wikipedian process seems to regard change and popular voting over stability and accuracy.
All pure functions are necessarily referentially transparent. Since, by definition, they cannot access anything other than what they are passed, their result must be fully determined by their arguments.
However, it is possible to have referentially transparent functions which are not pure. I can write a function which is given an int i, then generates a random number r, subtracts r from itself and places it in s, then returns i - s. Clearly this function is impure, because it is generating random numbers. However, it is referentially transparent. In this case, the example is silly and contrived. However, in, e.g., Haskell, the id function is of type a - > a whereas my stupidId function would be of type a -> IO a indicating that it makes use of side effects. When a programmer can guarantee through means of an external proof that their function is actually referentially transparent, then they can use unsafePerformIO to strip the IO back away from the type.
I'm somewhat unsure of the answer I give here, but surely somebody will point us in some direction. :-)
"Purity" is generally considered to mean "lack of side-effects". An expression is said to be pure if its evaluation lacks side-effects. What's a side-effect then? In a purely functional language, side-effect is anything that doesn't go by the simple beta-rule (the rule that to evaluate function application is the same as to substitute actual parameter for all free occurrences of the formal parameter).
For example, in a functional language with linear (or uniqueness, this distinction shouldn't bother at this moment) types some (controlled) mutation is allowed.
So I guess we have sorted out what "purity" and "side-effects" might be.
Referential transparency (according to the Wikipedia article you cited) means that variable can be replaced by the expression it denotes (abbreviates, stands for) without changing the meaning of the program at hand (btw, this is also a hard question to tackle, and I won't attempt to do so here). So, "purity" and "referential transparency" are indeed different things: "purity" is a property of some expression roughly means "doesn't produce side-effects when executed" whereas "referential transparency" is a property relating variable and expression that it stands for and means "variable can be replaced with what it denotes".
Hopefully this helps.
These slides from one ACCU2015 talk have a great summary on the topic of referential transparency.
From one of the slides:
A language is referentially transparent if (a)
every subexpression can be replaced by any other
that’s equal to it in value and (b) all occurrences of
an expression within a given context yield the
same value.
You can have, for instance, a function that logs its computation to the program standard output (so, it won't be a pure function), but you can replace calls for this function by a similar function that doesn't log its computation. Therefore, this function have the referential transparency property. But... the above definition is about languages, not expressions, as the slides emphasize.
[...] it's the same as if it were pure in the first place, isn't it?
From the definitions we have, no, it is not.
Is there a simpler way to understand the differences between a pure expression and a referentially transparent one, if any?
Try the slides I mentioned above.
The nice thing about standards is that there are so many of them to choose
from.
Andrew S. Tanenbaum.
...along with definitions of referential transparency:
from page 176 of Functional programming with Miranda by Ian Holyer:
8.1 Values and Behaviours
The most important property of the semantics of a pure functional language is that the declarative and operational views of the language coincide exactly, in the following way:
Every expression denotes a value, and there are valuescorresponding to all possible program behaviours. Thebehaviour produced by an expression in any context is completely determined by its value, and vice versa.
This principle, which is usually rather opaquely called referential transparency, can also be pictured in the following way:
and from Nondeterminism with Referential Transparency in Functional Programming Languages by F. Warren Burton:
[...] the property that an expression always has the same value in the same environment [...]
...for various other definitions, see Referential Transparency, Definiteness and Unfoldability by Harald Søndergaard and Peter Sestoft.
Instead, we'll begin with the concept of "purity". For the three of you who didn't know it already, the computer or device you're reading this on is a solid-state Turing machine, a model of computing intrinsically connected with effects. So every program, functional or otherwise, needs to use those effects To Get Things DoneTM.
What does this mean for purity? At the assembly-language level, which is the domain of the CPU, all programs are impure. If you're writing a program in assembly language, you're the one who is micro-managing the interplay between all those effects - and it's really tedious!
Most of the time, you're just instructing the CPU to move data around in the computer's memory, which only changes the contents of individual memory locations - nothing to see there! It's only when your instructions direct the CPU to e.g. write to video memory, that you observe a visible change (text appearing on the screen).
For our purposes here, we'll split effects into two coarse categories:
those involving I/O devices like screens, speakers, printers, VR-headsets, keyboards, mice, etc; commonly known as observable effects.
and the rest, which only ever change the contents of memory.
In this situation, purity just means the absence of those observable effects, the ones which cause a visible change to the environment of the running program, maybe even its host computer. It is definitely not the absence of all effects, otherwise we would have to replace our solid-state Turing machines!
Now for the question of 42 life, the Universe and everything what exactly is meant by the term "referential transparency" - instead of herding cats trying to bring theorists into agreement, let's just try to find the original meaning given to the term. Fortunately for us, the term frequently appears in the context of I/O in Haskell - we only need a relevant article...here's one: from the first page of Owen Stephen's Approaches to Functional I/O:
Referential transparency refers to the ability to replace a sub-expression with one of equal value, without changing the value of the outer expression. Originating from Quine the term was introduced to Computer Science by Strachey.
Following the references:
From page 9 of 39 in Christopher Strachey's Fundamental Concepts in Programming Languages:
One of the most useful properties of expressions is that called by Quine referential transparency. In essence this means that if we wish to find the value of an expression which contains a sub-expression, the only thing we need to know about the sub-expression is its value. Any other features of the sub-expression, such as its internal structure, the number and nature of its components, the order in which they are evaluated or the colour of the ink in which they are written, are irrelevant to the value of the main expression.
From page 163 of 314 in Willard Van Ormond Quine's Word and Object:
[...] Quotation, which thus interrupts the referential force of a term, may be said to fail of referential transparency2. [...] I call a mode of confinement Φ referentially transparent if, whenever an occurrence of a singular term t is purely referential in a term or sentence ψ(t), it is purely referential also in the containing term or sentence Φ(ψ(t)).
with the footnote:
2 The term is from Whitehead and Russell, 2d ed., vol. 1, p. 665.
Following that reference:
From page 709 of 719 in Principa Mathematica by Alfred North Whitehead and Bertrand Russell:
When an assertion occurs, it is made by means of a particular fact, which is an instance of the proposition asserted. But this particular fact is, so to speak, "transparent"; nothing is said about it, bit by means of it something is said about something else. It is the "transparent" quality which belongs to propositions as they occur in truth-functions.
Let's try to bring all that together:
Whitehead and Russell introduce the term "transparent";
Quine then defines the qualified term "referential transparency";
Strachey then adapts Quine's definition in defining the basics of programming languages.
So it's a choice between Quine's original or Strachey's adapted definition. You can try translating Quine's definition for yourself if you like - everyone who's ever contested the definition of "purely functional" might even enjoy the chance to debate something different like what "mode of containment" and "purely referential" really means...have fun! The rest of us will just accept that Strachey's definition is a little vague ("In essence [...]") and continue on:
One useful property of expressions is referential transparency. In essence this means that if we wish to find the value of an expression which contains a sub-expression,
the only thing we need to know about the sub-expression is its value. Any other features of the sub-expression, such as its internal structure, the number and nature of
its components, the order in which they are evaluated or the colour of the ink in which they are written, are irrelevant to the value of the main expression.
(emphasis by me.)
Regarding that description ("that if we wish to find the value of [...]"), a similar, but more concise statement is given by Peter Landin in The Next 700 Programming Languages:
the thing an expression denotes, i.e., its "value", depends only on the values of its sub-expressions, not on other properties of them.
Thus:
One useful property of expressions is referential transparency. In essence this means the thing an expression denotes, i.e., its "value", depends only on the values of its sub-expressions, not on other properties of them.
Strachey provides some examples:
(page 12 of 39)
We tend to assume automatically that the symbol x in an expression such as 3x2 + 2x + 17 stands for the same thing (or has the same value) on each occasion it occurs. This is the most important consequence of referential transparency and it is only in virtue of this property that we can use the where-clauses or λ-expressions described in the last section.
(and on page 16)
When the function is used (or called or applied) we write f[ε] where ε can be an expression. If we are using a referentially transparent language all we require to know about the expression ε in order to evaluate f[ε] is its value.
So referential transparency, by Strachey's original definition, implies purity - in the absence of an order of evaluation, observable and other effects are practically useless...
I'll quote John Mitchell Concept in programming language. He defines pure functional language has to pass declarative language test which is free from side-effects or lack of side effects.
"Within the scope of specific deceleration of x1,...,xn , all occurrence of an expression e containing only variables x1,...,xn have the same value."
In linguistics a name or noun phrase is considered referentially transparent if it may be replaced with the another noun phrase with same referent without changing the meaning of the sentence it contains.
Which in 1st case holds but in 2nd case it gets too weird.
Case 1:
"I saw Walter get into his new car."
And if Walter own a Centro then we could replace that in the given sentence as:
"I saw Walter get into his Centro"
Contrary to first :
Case #2 : He was called William Rufus because of his read beard.
Rufus means somewhat red and reference was to William IV of England.
"He was called William IV because of his read beard." looks too awkward.
Traditional way to say is, a language is referentially transparent if we may replace one expression with another of equal value anywhere in the program without changing the meaning of the program.
So, referential transparency is a property of pure functional language.
And if your program is free from side effects then this property will hold.
So give it up is awesome advice but get it on might also look good in this context.
Pure functions are those that return the same value on every call, and do not have side effects.
Referential transparency means that you can replace a bound variable with its value and still receive the same output.
Both pure and referentially transparent:
def f1(x):
t1 = 3 * x
t2 = 6
return t1 + t2
Why is this pure?
Because it is a function of only the input x and has no side-effects.
Why is this referentially transparent?
You could replace t1 and t2 in f1 with their respective right hand sides in the return statement, as follows
def f2(x):
return 3 * x + 6
and f2 will still always return the same result as f1 in every case.
Pure, but not referentially transparent:
Let's modify f1 as follows:
def f3(x):
t1 = 3 * x
t2 = 6
x = 10
return t1 + t2
Let us try the same trick again by replacing t1 and t2 with their right hand sides, and see if it is an equivalent definition of f3.
def f4(x):
x = 10
return 3 * x + 6
We can easily observe that f3 and f4 are not equivalent on replacing variables with their right hand sides / values. f3(1) would return 9 and f4(1) would return 36.
Referentially transparent, but not pure:
Simply modifying f1 to receive a non-local value of x, as follows:
def f5:
global x
t1 = 3 * x
t2 = 6
return t1 + t2
Performing the same replacement exercise from before shows that f5 is still referentially transparent. However, it is not pure because it is not a function of only the arguments passed to it.
Observing carefully, the reason we lose referential transparency moving from f3 to f4 is that x is modified. In the general case, making a variable final (or those familiar with Scala, using vals instead of vars) and using immutable objects can help keep a function referentially transparent. This makes them more like variables in the algebraic or mathematical sense, thus lending themselves better to formal verification.

Are idempotent functions the same as pure functions?

I read Wikipedia's explanation of idempotence.
I know it means a function's output is determined by it's input.
But I remember that I heard a very similar concept: pure function.
I Google them but can't find their difference...
Are they equivalent?
An idempotent function can cause idempotent side-effects.
A pure function cannot.
For example, a function which sets the text of a textbox is idempotent (because multiple calls will display the same text), but not pure.
Similarly, deleting a record by GUID (not by count) is idempotent, because the row stays deleted after subsequent calls. (additional calls do nothing)
A pure function is a function without side-effects where the output is solely determined by the input - that is, calling f(x) will give the same result no matter how many times you call it.
An idempotent function is one that can be applied multiple times without changing the result - that is, f(f(x)) is the same as f(x).
A function can be pure, idempotent, both, or neither.
No, an idempotent function will change program/object/machine state - and will make that change only once (despite repeated calls). A pure function changes nothing, and continues to provide a (return) result each time it is called.
Functional purity means that there are no side effects. On the other hand, idempotence means that a function is invariant with respect to multiple calls.
Every pure function is side effect idempotent because pure functions never produce side effects even if they are called more then once. However, return value idempotence means that f(f(x)) = f(x) which is not effected by purity.
A large source of confusion is that in computer science, there seems to be different definitions for idempotence in imperative and functional programming.
From wikipedia (https://en.wikipedia.org/wiki/Idempotence#Computer_science_meaning)
In computer science, the term idempotent is used more comprehensively to describe an operation that will produce the same results if executed once or multiple times. This may have a different meaning depending on the context in which it is applied. In the case of methods or subroutine calls with side effects, for instance, it means that the modified state remains the same after the first call. In functional programming, though, an idempotent function is one that has the property f(f(x)) = f(x) for any value x.
Since a pure function does not produce side effects, i am of the opinion that idempotence has nothing to do with purity.
I've found more places where 'idempotent' is defined as f(f(x)) = f(x) but I really don't believe that's accurate.
Instead I think this definition is more accurate (but not totally):
describing an action which, when performed multiple times on the same
subject, has no further effect on its subject after the first time it
is performed. A projection operator is idempotent.
The way I interpret this, if we apply f on x (the subject) twice like:
f(x);f(x);
then the (side-)effect is the same as
f(x);
Because pure functions dont allow side-effects then pure functions are trivially also 'idempotent'.
A more general (and more accurate) definition of idempotent also includes functions like
toggle(x)
We can say the degree of idempotency of a toggle is 2, because after applying toggle every 2 times we always end up with the same State

Is my understanding of type systems correct?

The following statements represent my understanding of type systems (which suffers from too little hands-on experience outside the Java world); please correct any errors.
The static/dynamic distinction seems pretty clear-cut:
Statically typed langauges assign each variable, field and parameter a type and the compiler prevents assignments between incompatible types. Examples: C, Java, Pascal.
Dynamically typed languages treat variables as generic bins that can hold anything you want - types are checked (if at all) only at runtime when you actually perform operations on the values, not when you assign them. Examples: Smalltalk, Python, JavaScript.
Type inference allows statically typed languages to look like (and have some of the advantages of) dynamically typed ones, by inferring types from the context so that you don't have to declare them most of the time - but unlike in dynamic languages, you cannot e.g. use a variable to hold a string initially and then assign an integer to it. Examples: Haskell, Scala
I am much less certain about the strong/weak distinction, and I suspect that it's not very clearly defined:
Strongly typed languages assign each runtime value a type and only allow operations to be performed that are defined for that type, otherwise there is an explicit type error.
Weakly typed languages don't have runtime type checks - if you try to perform an operation on a value that it does not support, the results are unpredictable. It may actually do something useful, but more likely you'll get corrupted data, a crash, or some undecipherable secondary error.
There seems to be at least two different kinds of weakly typed languages (or perhaps a continuum):
In C and assembler, values are basically buckets of bits, so anything is possible and if you get the compiler to dereference the first 4 bytes of a null-terminated string, you better hope it leads somewhere that does not contain legal machine code.
PHP and JavaScript are also generally considered weakly typed, but do not consider values to be opaque bit buckets; they will, however, perform implicit type conversions.
But these implicit conversions seem to apply mainly to string/integer/float variables - does that really warrant the classification as weakly typed? Or are there other issues where these languages's type system may obfuscate errors?
I am much less certain about the strong/weak distinction, and I suspect that it's not very clearly defined.
You are right: it isn't.
This is what Benjamin C. Pierce, author of Types and Programming Languages and Advanced Types and Programming Languages has to say:
I spent a few weeks... trying to sort out the terminology of "strongly typed," "statically typed," "safe," etc., and found it amazingly difficult.... The usage of these terms is so various as to render them almost useless.
Luca Cardelli, in his Typeful Programming article, defines it as the absence of unchecked run-time type errors. Tony Hoare calls that exact same property "security". Other papers call it "type safety" or simply "safety".
Mark-Jason Dominus wrote a classic rant about this a couple of years ago on the comp.lang.perl.moderated newsgroup, in a discussion about whether or not Perl was strongly typed. In this rant he states that within just a few hours of research, he was able to find 8 different, sometimes contradictory definitions, mostly from respected sources like college textbooks or peer-reviewed papers. In particular, those texts contained examples that were meant to help the students distinguish between strongly and weakly typed languages, and according to those examples, C is strongly typed, C is weakly typed, C++ is strongly typed, C++ is weakly typed, Lisp is strongly typed, Lisp is weakly typed, Perl is strongly typed, Perl is weakly typed. (Does that clear up any confusion?)
The only definition that I have seen consistently applied is:
strongly typed: my programming language
weakly typed: your programming language
Regarding static and dynamic typing you are dead on the money. Static typing means that programs are checked before being executed, and a program might be rejected before it starts. Dynamic typing means that the types of values are checked during execution, and a poorly typed operation might cause the program to halt or otherwise signal an error at run time. A primary reason for static typing is to rule out programs that might have such "dynamic type errors".
Bob Harper has argued that a dynamically typed language can (and should) be considered to be a statically typed language with a single type, which Bob calls "value". This view is fair, but it's helpful only in limited contexts, such as trying to be precise about the type theory of languages.
Although I think you grasp the concept, your bullets do not make it clear that type inference is simply a special case of static typing. In most languages with type inference, type annotations are optional, but not necessarily in all contexts. (Example: signatures in ML.) Advanced static type systems often give you a tradeoff between annotations and inference; for example, in Haskell you can type polymorphic functions of higher rank (forall to the left of an arrow) but only with an annotations. So, if you are willing to add an annotation, you can get the compiler to accept a program that would be rejected without the annotation. I think this is the wave of the future in type inference.
The ideas of "strong" and "weak" typing I would characterize as not useful, because they don't have a universally agreed on technical meaning. Strong typing generally means that there are no loopholes in the type system, whereas weak typing means the type system can be subverted (invalidating any guarantees). The terms are often used incorrectly to mean static and dynamic typing. To see the difference, think of C: the language is type-checked at compile time (static typing), but there are plenty of loopholes; you can pretty much cast a value of any type to another type of the same size—in particular, you can cast pointer types freely. Pascal was a language that was intended to be strongly typed but famously had an unforeseen loophole: a variant record with no tag.
Implementations of strongly typed languages often acquire loopholes over time, usually so that part of the run-time system can be implemented in the high-level language. For example, Objective Caml has a function called Obj.magic which has the run-time effect of simply returning its argument, but at compile time it converts a value of any type to one of any other type. My favorite example is Modula-3, whose designers called their type-casting construct LOOPHOLE.
I encourage you to avoid the terms "strong" and "weak" with regard to type systems, and instead say precisely what you mean, e.g., "the type system guarantees that the following class of errors cannot occur at run time" (strong), "the static type system does not protect against certain run-time errors" (weak), or "the type system has a loophole" (weak). Just calling a type system "strong" or "weak" by itself does not communicate very much.
This is a pretty accurate reflection of my own understanding of the topic of the static/dynamic, strong/weak typing discussion. In addition, you can consider those other languages:
In languages such as TCL and Bourne Shell, the "main" value type is the string. Numeric operators are available that implicitly coerce input values from string representation and result values to string representation. They can be considered examples of dynamic, weakly typed languages.
Forth may be an example of a static, weakly typed language. The language performs no type checking of its own, and the main stack may interchangeably contain pointers, integers, strings (conventionally represented as two cells, start and length). Inconsistent use of operators can lead to either interesting, or unspecified behavior. Typical Forth implementations provide a separate stack for floating point numbers.
Maybe this Book can help. Be prepared for some math though. If I remember correctly, a "non-math" statement was: "Strongly typed: A language that I feel safe to program with".
There seems to be at least two different kinds of weakly typed languages (or perhaps a continuum):
In C and assembler, values are basically buckets of bits, so anything is possible and if you get the compiler to dereference the first 4 bytes of a null-terminated string, you better hope it leads somewhere that does not contain legal machine code.
I would disagree with this statement, at least in C. You can manipulate the type system in C in such a way that you can treat any given memory location as a bucket of bits, but a variable most definitely has a type and that type has specific properties. The fact that there are no runtime checks (unless you consider floating point exceptions or segmentation faults to be runtime checks) isn't really relevant. C can be considered "weakly typed" in the sense that the compiler will perform some implicit type conversion for you, but it doesn't go very far with it.
I consider strong/weak to be the concept of implicit conversion and a good example is addition of a string and a number. In a strongly typed language the conversion won't happen (at least in all languages I can think of) and you'll get an error. Weakly typed languages like VB (with Option Explicit Off) and Javascript will try to cast one of the operands to the other type.
In VB.Net with Option Strict Off:
Dim A As String = "5"
Dim B As Integer = 5
Trace.WriteLine(A + B) 'returns 10
With Option Strict On (turning VB into a strongly typed language) you'll get a compiler error.
In Javascript:
var A = '5';
var B = 5;
alert(A + B);//returns 55
Some people will say that the results are not predictable but they actually do follow a set of rules.
Hmm, don't know much more either, but I wanted to mention C++ and its implicit converstions(implicit constructors). This might be as well an example of weak typing.
I agree with the others who say "there doesn't seem to be a hard and fast definition here." My answer tends to be based on how much rope the language gives you WRT types. If you can pretty much fake anything you want, then it's weak. If it really doesn't let you get yourself into trouble, even if you want to, it's strong.
I really haven't seen too many languages that skirt this border, so I can't say that I've ever needed a better definition that that...