Static chains and binding - language-agnostic

I'm confused about how binding works for statically scoped variables in nested subroutines.
proc A:
    var a, x
    ...
    proc B:
        var x, y
        ...
        proc B2:
            var a, b
            ...
        end B2
    end B
    proc C:
        var x, z, w
        ...
    end C
end A
First, this is what I have understood: with static scoping, B2 can use the variables x and y declared in its parent B. Similarly, C can use the variable a declared in proc A.
Now, my questions are: are these bindings made at compile time or at run time? Does it make a difference whether the variables are statically scoped or dynamically scoped?

Until it comes naturally, I find it easy to draw environment model diagrams. They are also pretty much essential for exams and those esoteric examples that are intended to be confusing. I suggest the famous SICP (http://mitpress.mit.edu/sicp/), but there are obviously more than enough resources on the internet (a quick google brought me to this: http://www.icsi.berkeley.edu/~gelbart/cs61a/EnvDiagrams.pdf).
When and how bindings are made depends on the language and implementation, but in your example they can all be made at compile time. In general, static scoping, as the name suggests, allows a lot of static (compile-time) binding: a compiler can look into a function, see every reference, and resolve it immediately. For example, in B2 a reference to y can be resolved immediately to the enclosing scope, i.e. that of B. What is fixed at compile time is which declaration the name refers to; the storage of the particular activation is still located at run time, for example by following the static chain.
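As a rough illustration of that compile-time resolution, here is a sketch in Python (not the pseudocode language of the question; Python is statically scoped and lets you inspect what its compiler decided). The nesting mirrors A, B, B2 and C. The string values, and the choice to return the inner procedures so they can be called from outside, are assumptions made only for the demonstration.

```python
def A():
    a, x = "A.a", "A.x"

    def B():
        x, y = "B.x", "B.y"        # B's x shadows A's x

        def B2():
            a, b = "B2.a", "B2.b"  # B2's a shadows A's a
            return (a, x, y)       # x and y come from the enclosing B

        return B2

    def C():
        x, z, w = "C.x", "C.z", "C.w"
        return (a, x)              # a comes from the enclosing A

    return B, C


B, C = A()
B2 = B()
print(B2())   # ('B2.a', 'B.x', 'B.y')
print(C())    # ('A.a', 'C.x')

# Names the compiler resolved to an enclosing scope, before any call ran:
print(sorted(B2.__code__.co_freevars))  # ['x', 'y']
print(C.__code__.co_freevars)           # ('a',)
```

The last two lines only read metadata produced when the source was compiled, which is the sense in which the binding of a name to a declaration is static here.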
As for dynamic vs. static scoping, there is a huge difference. With dynamic scoping, as the name suggests, compile-time binding is much harder, since the structure of the code does not determine which variable a reference denotes: different paths of execution may yield different bindings. You'll have to be more specific with the question, though.

How is 'pass by reference' implemented without actually passing an address to a function? [closed]

I am well aware of the fact that in C and C++ everything is passed by value (even if that value is of reference type). I think (but I'm no expert there) the same is true for Java.
So, and that's why I include language-agnostic as a tag, in what language can I pass anything to a function without passing some value?
And if that exists, what does the mechanism look like? I thought hard about that, and I fail to come up with any mechanism that does not involve the passing of a value.
Even if the compiler optimizes in a way that I don't have a pointer/reference as a true variable in memory, it still has to calculate an address as an offset from the stack (frame) pointer - and pass that.
Anybody who could enlighten me?
From C perspective:
There are no references as a language level concept. Objects are referred to by pointing at them with pointers.
The value of a pointer is the address of the pointed object. Pointers are passed by value just like any other arguments. A pointed object is conceptually passed by reference.
At least from C++ perspective:
How is 'pass by reference' implemented [...] ?
Typically, by copying the address of the object.
... without actually passing an address to a function?
If a function invocation is expanded inline, there is no need to copy the address anywhere. Same applies to pointers too, because copies may be elided due to the as-if rule.
in what language can I pass anything to a function without passing some value?
Such a language would have to have a significantly different concept of a function than C has. There would have to be no stack frame push.
Function-like C pre-processor macros, as their name implies, are similar to functions, but their arguments are not passed around at runtime, because pre-processing happens before compilation.
On the other hand, you can have global variables. If you change the global state of the program, and call a function with no arguments, you have conceptually "passed the new global state to the function" without having passed any value.
At a machine-code level, "pass X by reference" is essentially "pass the address of X by value".
Pointers are values. Values are values. Values have a unique identity and require storage.
References are not values. References have no identity. If we have:
int x=0;
int& y=x;
int& z=x;
both y and z are references to x, and they have no independent identity.
In comparison:
int x=0;
int* py=&x;
int* pz=&x;
both py and pz are pointers at x, and they have independent identity. You could modify py and not pz, you can get a size of them, you can memset them.
In some circumstances, at the machine code level, references are implemented the same way as pointers, except certain operations are never performed on them (like reaiming them).
But C++ is not defined in terms of machine code. It is defined in terms of the behaviour of an abstract machine. Compilers compile your code to operations on this abstract machine, which has no fixed calling convention (by the standard), no layout for references, no stack, no heap, etc. The compiler then performs arbitrary transformations that do not change the as-if behaviour (a common one is single assignment), rearranges things, and at some point emits assembly/machine code that generates similar behaviour on the actual hardware you are running on.
Now the near universal way to compile C++ is the compilation unit/linker model, where functions are exported as symbols and a fixed ABI calling convention is provided for other compilation units to use them. Then at link stage the compilation units are connected together.
In those ABIs, references are passed as pointers.
How is 'pass by reference' implemented without actually passing an address to a function?
Within the context of the C languages, the short answers are:
In C, it is not.
In C++, a type followed by an ampersand (&) is a reference type.
For instance, int& is a reference to an int. When passing an argument
to a function that takes reference type, the object is truly passed
by reference. (More on this in the scholarly link below.)
But in truth, most of the confusion is semantics. Some of the confusion could be helped by:
1) Stop using the word emulated to describe passing an address.
2) Stop using the word reference to describe address
Or
3) Recognize that within the context of the C/C++ languages, in the
phrase pass-by-reference, the word reference is defined as: value of
address.
Beyond this, there are many examples of illusions and concepts created to convey impossible ideas. The concept of non-emulated pass-by-reference is arguably one of them, no matter how many scholarly papers or practical discussions.
This one (scholarly paper category) is yet another that presents a distinction between emulated and actual pass-by-reference in a discussion using both C & C++, but whose conclusions stick closely to reality. The following is an excerpt:
...Somehow, it is only a matter of how the concept of “passing by reference” is actually realized by a programming language: C implements this by using pointers and passing them by value to functions whereas C++ provides two implementations. From a side, it reuses the same mechanism derived from C (i.e., pointers + pass by value). On the other hand, C++ also provides a native “pass by reference” solution which makes use of the idea of reference types. Thus, even in C++ if you are passing a pointer à la C, you are not truly passing by reference, you are passing a pointer by value (that is, of course, unless you are passing a reference to a pointer! e.g., int*&).
Because of this potential ambiguity in the term “pass by reference”, perhaps it’s best to only use it in the context of C++ when you are using a reference type.
But as you, and others have already noted, in the concept of passing anything via an argument, whether value or reference, that something must by definition have a value.
What is meant by pass by value is that the object itself is passed.
In pass by pointer, we pass the value of the pointer to the object.
In pass by reference, we pass a reference (basically a pointer that we know points to an object) in the same way.
So yes, we always pass a value, but the question is what is the value? Not always the object itself. But when we say pass a variable by **, we give the information relative to the object we want to pass, not the value actually passed.
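A small sketch of "we always pass a value, the question is which value", using Python as an assumed example language (it is not one of the languages discussed above). There the value passed is a reference to the object, so a mutation made through it is visible to the caller, while rebinding the parameter is not. The function names are made up for illustration.

```python
def mutate(items):
    items.append(99)     # changes the object the caller also refers to


def rebind(items):
    items = [0, 0, 0]    # only changes the callee's local copy of the reference


data = [1, 2, 3]
mutate(data)
print(data)              # [1, 2, 3, 99]
rebind(data)
print(data)              # still [1, 2, 3, 99]
```

This is the "pass the address of X by value" situation: the callee receives a copy of the reference, not a new name for the caller's variable.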

Flexible programming for inverse function or root finding in Freepascal

I have a huge lib of math functions, like the pdf or cdf of statistical distributions. But often the inverse cdf, for example, can only be calculated numerically, e.g. using Newton-Raphson or bisection; in the latter case we need to check whether cdf(x) is greater or less than the target y0.
However, many functions have further parameters like a Gaussian distribution having certain mean and sigma, so cdf is cdf(x,mean,sigma). Whereas other functions, such as standard normal cdf, have no further parameters, or some have even 3 or 4 further parameters.
A similar problem would happen if you want to apply bisection for either linear functions (2 parameters) or parabolas (3 parameters). Or if you want not the inverse function, but e.g. the integral of f.
The easiest implementation would be to define cdf as global function f(x); and to check for >y0 or global variables.
However, this is a very old-fashioned way, and Freepascal also supports procedural parameters, for calls like x=icdf(0.9987,#cdfStdNorm)
Even overloading is supported to allow calls like x2=icdf(0.9987,0,2,#cdfNorm) to pass also mean and sigma.
But this ends up still in two separate code blocks (even whole functions), because in one case we need to call cdf only with x, and in 2nd example also with mean and sigma.
Is there an elegant solution for this problem in Freepascal? Maybe using variant records? Or an object-oriented approach? I have no clue about OO, but I know the variant object style would require changing at least the headers of many functions, because I want to apply the technique not only to inverse cdf calculation, but also to numerical integration, root finding, optimization, etc.
Or is it "best" just to define a real function type with e.g. x + 5 parameters (maybe as array), and to ignore the unused parameters? But for me it looks that then I would need many "wrapper" functions or to re-code all the existing functions (to use the arrays, even if they are sometimes not needed!).
Maybe macros can help as well? Any Freepascal hints are very welcome!
If you make it a (function .. of object), mean and sigma could be part of the class, and the function could internally just access it. Only the really changing parameters during the iteration would be parameters. (read: x)
Anonymous methods, as talked about by David and Rudy, are a further step that avoids having to declare a class for each such invocation, but that is a convenience thing and IMHO not the core of the question. At the expense of declaring the class, your core code is free of global variable use, and anonymous methods might also come with a performance cost, depending on usage.
Free Pascal also supports nested functions (function... is nested), which is the original Pascal closure-like way which was never adopted by Pascal compilers from Borland. A nested procedure passed as callback can access local variables in the procedure where it was declared. The Free Pascal numlib numeric math package uses this in some cases for similar cases like yours. For math it is even more natural.
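The idea of letting the callback capture mean and sigma, so that the solver only ever sees f(x), can be sketched as follows. This is Python rather than Free Pascal, purely for illustration; the names bisect_inverse, cdf_std_norm, cdf_norm and make_cdf_norm are hypothetical, and the solver assumes f is increasing on the bracket [lo, hi].

```python
import math


def bisect_inverse(f, y0, lo, hi, tol=1e-12, max_iter=200):
    """Solve f(x) = y0 on [lo, hi] by bisection, assuming f is increasing."""
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if f(mid) < y0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def cdf_std_norm(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cdf_norm(x, mean, sigma):
    return cdf_std_norm((x - mean) / sigma)


def make_cdf_norm(mean, sigma):
    # The nested function captures mean and sigma; only x remains a parameter.
    def cdf(x):
        return cdf_norm(x, mean, sigma)
    return cdf


# One solver, no overloads: both calls pass a function of x alone.
x1 = bisect_inverse(cdf_std_norm, 0.9987, -10.0, 10.0)
x2 = bisect_inverse(make_cdf_norm(0.0, 2.0), 0.9987, -20.0, 20.0)
print(x1, x2)
```

The existing cdf_norm keeps its header; only a thin capturing wrapper is added, and the same solver could be reused for integration or root finding of any one-argument function.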
Delphi never implements old constructs because borrowing syntax from other languages looks better on bulletlists and keeps the subscriptions flowing.

What are the distinctions between lexical and static scoping?

In "R programming for those coming from other languages", John Cook says that
R uses lexical scoping while S-PLUS uses static scope. The difference can be subtle, particularly when using closures.
I found this odd because I have always thought lexical scoping and static scoping were synonymous.
Are there distinct attributes to lexical and static scoping, or is this a distinction that changes from community to community, person to person? If so, what are the general camps and how do I tell them apart so I can better understand someone's meaning when they use these words?
Wikipedia (and I) agree with you that the terms "lexical scope" and "static scope" are synonymous. This Lua discussion tries to make a distinction, but notes that people don't agree as to what that distinction is. :-)
It appears to me that the attempted distinction has to do with accessing names in a different function-activation-record ("stack block", if you will) than the most-current-execution record, which mainly (only?) occurs in nested functions:
function f:
    var x
    function h:
        var y
        use(y) -- obviously, accesses y in current activation of h
        use(x) -- the question is, which x does this access?
With lexical scope, the answer is "the x of the activation of f in which this h was defined" (in the simple case, the activation of f that called h), and with dynamic scope it means "the most recent activation that has any variable named x" (which might not be f). On the other hand, if the language forbids the use of x at all, there's no question about "which x is this" since the answer is "error". :-) It looks as though some people use "static scope" to refer to this third case.
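To make the two lookup rules concrete, here is a sketch in Python. Python itself is lexically scoped, so the dynamic rule is only simulated, with an explicit stack of activation records; call_stack, lookup and the dyn_* functions are hypothetical helpers, and g is an extra function (not in the pseudocode above) that has its own x and sits between f and h on the call path.

```python
def lexical_demo():
    def f():
        x = "x from f"

        def h():
            return x              # lexical: the x of the enclosing f

        return g(h)

    def g(callback):
        x = "x from g"            # unrelated local, invisible to h
        return callback()

    return f()


# Dynamic scoping, simulated: each call pushes a record of its variables.
call_stack = []


def lookup(name):
    # Walk from the most recent activation outwards.
    for record in reversed(call_stack):
        if name in record:
            return record[name]
    raise NameError(name)


def call_with(record, body):
    call_stack.append(record)
    try:
        return body()
    finally:
        call_stack.pop()


def dyn_h():
    return call_with({"y": "y from h"}, lambda: lookup("x"))


def dyn_g():
    return call_with({"x": "x from g"}, dyn_h)


def dyn_f():
    return call_with({"x": "x from f"}, dyn_g)


print(lexical_demo())  # x from f
print(dyn_f())         # x from g
```

The same call path f -> g -> h gives different answers: the lexical rule follows where h was written, the dynamic rule follows who is currently on the stack.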
R official documentation also addresses differences of scope between R and S-plus:
http://cran.r-project.org/doc/manuals/R-intro.html#Scope
The example given from the link can be simplified like this:
cube <- function(n) {
    sq <- function() n*n
    n*sq()
}
The results from S-Plus and R are different:
## first evaluation in S
S> cube(2)
Error in sq(): Object "n" not found
Dumped
S> n <- 3
S> cube(2)
[1] 18
## then the same function evaluated in R
R> cube(2)
[1] 8
I personally think the way R treats variables is more natural.

What is the difference between a function and a subroutine?

What is the difference between a function and a subroutine? I was told that the difference between a function and a subroutine is as follows:
A function takes parameters, works locally and does not alter any value or work with any value outside its scope (high cohesion). It also returns some value. A subroutine works directly with the values of the caller or code segment which invoked it and does not return values (low cohesion), i.e. branching some code to some other code in order to do some processing and come back.
Is this true? Or is there no difference, just two terms to denote one?
I disagree. If you pass a parameter by reference to a function, you would be able to modify that value outside the scope of the function. Furthermore, functions do not have to return a value. Consider void some_func() in C. So the premises in the OP are invalid.
In my mind, the difference between function and subroutine is semantic. That is to say some languages use different terminology.
A function returns a value whereas a subroutine does not. A function should not change the values of actual arguments whereas a subroutine could change them.
That's my definition of them ;-)
If we talk about C, C++, Java and other related high-level languages:
a. A subroutine is a logical construct used in writing Algorithms (or flowcharts) to designate processing functionality in one place. The subroutine provides some output based on input where the processing may remain unchanged.
b. A function is a realization of the Subroutine concept in the programming language
Both function and subroutine return a value, but while the function cannot change the value of the arguments coming IN on its way OUT, a subroutine can. Also, for a subroutine you need to define a variable name for the outgoing value, whereas for a function you only need to define the ingoing variables. For example, a function:
double multi(double x, double y)
{
    double result;
    result = x*y;
    return result;
}
will have only input arguments and won't need the output variable for the returning value. On the other hand same operation done through a subroutine will look like this:
void mult(double *x, double *y, double *result)
{
    *result = (*x) * (*y);  /* the product goes out through result */
    *x = 20;                /* the caller's x and y are changed too */
    *y = 2;
}
This will do the same as the function did, that is, return the product of x and y, but in this case (1) you need to define result as a variable and (2) you can change the values of x and y on the way back.
One of the differences could be from the origin where the terminology comes from.
Subroutine is more of a computer architecture/organization term, which means a reusable group of instructions that performs one task. It is stored in memory once, but used as often as necessary.
Function got its origin from mathematical function where the basic idea is mapping a set of inputs to a set of permissible outputs with the property that each input is related to exactly one output.
In terms of Visual Basic a subroutine is a set of instructions that carries out a well defined task. The instructions are placed within Sub and End Sub statements.
Functions are similar to subroutines, except that the functions return a value. Subroutines perform a task but do not report anything to the calling program. A function commonly carries out some calculations and reports the result to the caller.
Based on Wikipedia subroutine definition:
In computer programming, a subroutine is a sequence of program
instructions that perform a specific task, packaged as a unit. This
unit can then be used in programs wherever that particular task should
be performed.
Subroutines may be defined within programs, or separately in libraries
that can be used by many programs. In different programming languages,
a subroutine may be called a procedure, a function, a routine, a
method, or a subprogram. The generic term callable unit is sometimes
used.
In Python, there is no distinction between subroutines and functions.
In VB/VB.NET function can return some result/data, and subroutine/sub can't.
In C# both subroutines and functions are referred to as methods.
Sometimes in OOP the function that belongs to the class is called a method.
There is no longer any need to distinguish between function, subroutine and procedure, because high-level languages abstract that difference away, so in the end there is very little semantic difference between them.
Yes, they are different, similar to what you mentioned.
A function has deterministic output and no side effects.
A subroutine does not have these restrictions.
A classic example of a function is int multiply(int a, int b)
It is deterministic as multiply(2, 3) will always give you 6.
It has no side effects because it does not modify any values outside its scope, including the values of a and b.
An example of a subroutine is void consume(Food sandwich)
It has no output so it is not a function.
It has side effects as calling this code will consume the sandwich and you can't call any operations on the same sandwich anymore.
You can think of a function as f(x) = y, or for the case of multiply, f(a, b) = c. Yes, this is programming and not math. But math models and begs to be used. So we use math in cs. If you are interested to know why the distinction between function and subroutine, you should check out functional programming. It works like magic.
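The multiply/consume contrast can be sketched like this in Python (chosen only for illustration; the pantry list is a made-up stand-in for the Food object):

```python
def multiply(a, b):
    # Deterministic, no side effects: the result depends only on a and b.
    return a * b


def consume(pantry, item):
    # No useful return value; the whole point is the side effect.
    pantry.remove(item)


pantry = ["sandwich", "apple"]
print(multiply(2, 3))        # 6
consume(pantry, "sandwich")
print(pantry)                # ['apple']
```

Calling multiply(2, 3) twice changes nothing, whereas calling consume twice with the same sandwich would fail the second time, because the first call changed state outside the routine.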
From the view of the user, there is no difference between a programming function and a subroutine but in theory, there definitely is!
The concept itself is different between a subroutine and a function. Formally, the OP's definition is correct. Subroutines don't take arguments or give return values by formal semantics. That's just an interpretation with conventions. And variables in subroutines are accessible in other subroutines of the same file, although this can be achieved as well in C with some difficulties.
Summary:
Subroutines work only based on side-effects, in the view of the programming language you are programming with. The concept itself has no explicit arguments or return values. You have to use side effects to simulate them.
Functions are mappings of input to output value(s) in the original sense, some kind of general substitution operation. In the adopted sense of the programming world, functions are an abstraction of subroutines with information about return value and arguments, inspired by mathematical functions. The additional formal abstraction differentiates a function from a subroutine in programming context.
Details:
The subroutine originally is simply a repeatable snippet of code which you can call in between other code. It originates in Assembly or Machine language programming and designates the instruction sequence itself. In the light of this meaning, Perl also uses the term subroutine for its callable code snippets.
Subroutines are concrete objects.
This is what I understood: the concept of a (pure) function is a mathematical concept which is a special case of mathematical relations with an own formal notation. You have an input or argument and it is defined what value is represented by the function with the given argument. The original function concept is entirely unrelated to instructions or calculations. Mathematical operations (or instructions in the programming world) only are a popular formal representation (description) of the actual mapping. The original function term itself is not defined as code. Calculations do not constitute the function, so that functions actually don't have any computational overhead because they are direct mappings. Function complexity considerations only arrived as there is an overhead to find the mapping.
Functions are abstract objects.
Now, since the whole PC-stuff is running on small machine instructions, the easiest way to model (or instantiate) mathematics is with a sequence of instructions itself. Computer Science has been founded by mathematicians (noteworthy: Alan Turing) and the first programming concepts are based on it so there is a need to bring mathematics into the machine. That's how I imagine the reason why "function" is the name of something which is implemented as subroutine and why the term "pure" function was coined to differentiate the original function concept from the overly broad term-use in programming languages.
Note: in Assembly Language Programming, it is typically said, that a subroutine has been passed arguments and gives a return value. This is an interpretation on top of the concrete formal semantics. Calling conventions specify the location where values, to be considered as arguments and return values, should be written to before calling a subroutine or returning. The call itself takes only a subroutine address, and has no formal arguments or return values.
PS: functions in programming languages don't necessarily need to be a subroutine (even though programming language terminology developed this way). Functions in functional programming languages can be constant variables, arrays or hash tables. Isn't every datastructure in ECMAScript a function?
The difference is isolation. A subroutine is just a piece of the program that begins with a label and ends with a go to. A function is outside the namespace of the rest of the program. It is like a separate program that can have the same variable names as used in the calling program, and whatever it does to them does not affect the state of those variables with the same name in the calling program.
From a coding perspective, the isolation means that you don’t have to use the variable names that are local to the function.
Sub double:
    a = a + a
    Return
fnDouble(whatever):
    whatever = whatever + whatever
    Return whatever
The subroutine works only on a. If you want to double b you have to set a = b before calling the subroutine. Then you may need to set a to null or zero after. Then when you want to double c you have to again set a to equal c.
Also the sub might have in it some other variable, z, that is changed when the sub is jumped to, which is a bit dangerous.
The essential point is the isolation of names within the function (unless declared global in the function).
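The same contrast, sketched in Python as an assumed stand-in for the BASIC-like pseudocode: sub_double works on the shared variable a, while fn_double only sees its own parameter. Both names are hypothetical.

```python
a = 0  # the one variable the "subroutine" operates on


def sub_double():
    global a
    a = a + a                        # works only on the shared a


def fn_double(whatever):
    whatever = whatever + whatever   # local name, caller's variable untouched
    return whatever


# Doubling b with the subroutine: copy in, call, copy out.
b = 21
a = b
sub_double()
b = a

# Doubling c with the function: no shared variable involved.
c = 5
d = fn_double(c)

print(b, c, d)  # 42 5 10
```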
I am writing this answer from a VBA for Excel perspective. If you are writing a function then you can use it as an expression, i.e. you can call it from any cell in Excel.
eg: normal vlookup function in excel cannot look up values > 256 characters. So I used this function:
Function MyVlookup(Lval As Range, c As Range, oset As Long) As Variant
    Dim cl As Range
    For Each cl In c.Columns(1).Cells
        If UCase(Lval) = UCase(cl) Then
            MyVlookup = cl.Offset(, oset - 1)
            Exit Function
        End If
    Next
End Function
This is not my code. Got it from another internet post. It works fine.
But the real advantage is that I can now call it from any cell in Excel. If I wrote a subroutine I couldn't do that.
Every subroutine performs some specific task. For some subroutines, that task is to compute or retrieve some data value. Subroutines of this type are called functions. We say that a function returns a value. Generally, the returned value is meant to be used somehow in the program that calls the function.

Is there a relationship between calling a function and instantiating an object in pure functional languages?

Imagine a simple (made up) language where functions look like:
function f(a, b) = c + 42
where c = a * b
(Say it's a subset of Lisp that includes 'defun' and 'let'.)
Also imagine that it includes immutable objects that look like:
struct s(a, b, c = a * b)
Again analogizing to Lisp (this time a superset), say a struct definition like that would generate functions for:
make-s(a, b)
s-a(s)
s-b(s)
s-c(s)
Now, given the simple set up, it seems clear that there is a lot of similarity between what happens behind the scenes when you either call 'f' or 'make-s'. Once 'a' and 'b' are supplied at call/instantiate time, there is enough information to compute 'c'.
You could think of instantiating a struct as being like calling a function, and then storing the resulting symbolic environment for later use when the generated accessor functions are called. Or you could think of evaluating a function as being like creating a hidden struct and then using it as the symbolic environment with which to evaluate the final result expression.
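One way to make this toy model concrete is the following sketch, in Python rather than the made-up language. Here make_s stores the environment built from a and b, and f is written as "build the hidden struct, then evaluate the result expression in it". The dict stands in for an immutable struct by convention only, and the underscore names mirror make-s and s-c.

```python
def make_s(a, b):
    c = a * b                         # computed once, at "instantiation"
    return {"a": a, "b": b, "c": c}   # the stored environment


def s_c(s):
    return s["c"]                     # generated accessor


def f(a, b):
    env = make_s(a, b)                # the "hidden struct"
    return s_c(env) + 42              # final expression evaluated in it


print(s_c(make_s(2, 3)))  # 6
print(f(2, 3))            # 48
```

The only difference between the two is what happens to the environment afterwards: make_s hands it back to the caller, f discards it once the result is known.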
Is my toy model so oversimplified that it's useless? Or is it actually a helpful way to think about how real languages work? Are there any real languages/implementations that someone without a CS background but with an interest in programming languages (i.e. me) should learn more about in order to explore this concept?
Thanks.
EDIT: Thanks for the answers so far. To elaborate a little, I guess what I'm wondering is if there are any real languages where it's the case that people learning the language are told e.g. "you should think of objects as being essentially closures". Or if there are any real language implementations where it's the case that instantiating an object and calling a function actually share some common (non-trivial, i.e. not just library calls) code or data structures.
Does the analogy I'm making, which I know others have made before, go any deeper than mere analogy in any real situations?
You can't get much purer than lambda calculus: http://en.wikipedia.org/wiki/Lambda_calculus. Lambda calculus is in fact so pure, it only has functions!
A standard way of implementing a pair in lambda calculus is like so:
pair = fn a: fn b: fn x: x a b
first = fn a: fn b: a
second = fn a: fn b: b
So pair a b, what you might call a "struct", is actually a function (fn x: x a b). But it's a special type of function called a closure. A closure is essentially a function (fn x: x a b) plus values for all of the "free" variables (in this case, a and b).
So yes, instantiating a "struct" is like calling a function, but more importantly, the actual "struct" itself is like a special type of function (a closure).
If you think about how you would implement a lambda calculus interpreter, you can see the symmetry from the other side: you could implement a closure as an expression plus a struct containing the values of all the free variables.
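The pair encoding translates almost literally into Python lambdas, which gives a runnable sketch of both directions of the symmetry. Inspecting __closure__ and co_freevars is CPython-specific introspection, used here only to show the "struct of free variables" behind the closure.

```python
pair = lambda a: lambda b: lambda x: x(a)(b)
first = lambda a: lambda b: a
second = lambda a: lambda b: b

p = pair(1)(2)      # the "struct": a closure over a and b
print(p(first))     # 1
print(p(second))    # 2

# The closure is code plus the values of its free variables:
captured = dict(zip(p.__code__.co_freevars,
                    (cell.cell_contents for cell in p.__closure__)))
print(captured)     # {'a': 1, 'b': 2}
```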
Sorry if this is all obvious and you just wanted some real world example...
Both f and make-s are functions, but the resemblance doesn't go much further. Applying f calls the function and executes its code; applying make-s creates a structure.
In most language implementations and models, make-s is a different kind of object from f: f is a closure, whereas make-s is a constructor (in the functional languages and logic meaning, which is close to the object oriented languages meaning).
If you like to think in an object-oriented way, both f and make-s have an apply method, but they have completely different implementations of this method.
If you like to think in terms of the underlying logic, f and make-s have a type built on the same type constructor (the function type constructor), but they are constructed in different ways and have different destruction rules (function application vs. constructor application).
If you'd like to understand that last paragraph, I recommend Types and Programming Languages by Benjamin C. Pierce. Structures are discussed in §11.8.
Is my toy model so oversimplified that it's useless?
Essentially, yes. Your simplified model basically boils down to saying that each of these operations involves performing a computation and putting the result somewhere. But that is so general, it covers anything that a computer does. If you didn't perform a computation, you wouldn't be doing anything useful. If you didn't put the result somewhere, you would have done work for nothing as you have no way to get the result. So anything useful you do with a computer, from adding two registers together, to fetching a web page, could be modeled as performing a computation and putting the result somewhere that it can be accessed later.
There is a relationship between objects and closures. http://people.csail.mit.edu/gregs/ll1-discuss-archive-html/msg03277.html
The following creates what some might call a function, and others might call an object:
Taken from SICP ( http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-21.html )
(define (make-account balance)
  (define (withdraw amount)
    (if (>= balance amount)
        (begin (set! balance (- balance amount))
               balance)
        "Insufficient funds"))
  (define (deposit amount)
    (set! balance (+ balance amount))
    balance)
  (define (dispatch m)
    (cond ((eq? m 'withdraw) withdraw)
          ((eq? m 'deposit) deposit)
          (else (error "Unknown request -- MAKE-ACCOUNT"
                       m))))
  dispatch)
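For readers who do not know Scheme, a rough Python transliteration of make-account follows. It is a sketch, not part of SICP; the choice of ValueError for the unknown-request case is an assumption. The returned dispatch function is the "object", and balance lives only in the closure.

```python
def make_account(balance):
    def withdraw(amount):
        nonlocal balance
        if balance >= amount:
            balance -= amount
            return balance
        return "Insufficient funds"

    def deposit(amount):
        nonlocal balance
        balance += amount
        return balance

    def dispatch(m):
        # Message passing: pick a "method" by name.
        if m == "withdraw":
            return withdraw
        if m == "deposit":
            return deposit
        raise ValueError(f"Unknown request -- MAKE-ACCOUNT {m}")

    return dispatch


acc = make_account(100)
print(acc("withdraw")(30))    # 70
print(acc("deposit")(50))     # 120
print(acc("withdraw")(500))   # Insufficient funds
```

Each call to make_account creates a fresh balance, so two accounts behave like two independent instances of the same class.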