Statements and state - language-agnostic

Is there any deeper meaning in the fact that the word "statement" starts with the word "state", or is that just a curious coincidence? Note that English is not my native language, so the answer might be obvious to you, but not to me ;)

The English word "statement" does derive from the verb form of "state", but the programming terms "state" and "statement" aren't related to each other in any meaningful way. A statement in a programming language is merely a syntactic construct and does not imply that any state is involved.

Etymologically, the answer is yes, as Christopher pointed out.
However, I would argue that in programming, too, there is a connection. Statements (as opposed to expressions, which are syntactic elements evaluated for their value) are syntactic elements representing imperative commands. (You may hear that in C, many statements, e.g. assignments, are also expressions.)
As such, statements necessarily involve either some change in the state of the program (e.g. x := 5) or the imposition of some control flow (e.g. GOTO 10).
You will note that a purely functional language (say, Haskell) contains no statements, but only expressions.
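To make the contrast concrete, here is a minimal Python sketch (the variable names are just for illustration):

x = 5                      # assignment statement: commands a state change, has no value
if x > 3:                  # control-flow statement
    x = x - 1

y = (x * 2) + max(x, 10)   # expression: evaluated purely for its value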

It derives from the verb sense of "state".

The key concept of the imperative programming paradigm is that programs are represented as sequences of statements, each of which changes the program state.
So you change the state, or you could say you actually state (declare) something with each statement. It seems to me that the -ment suffix is used here in its third sense ("the means or instrument of an action"): statements are instruments for stating something in an imperative program.
In general usage, it seems that the -ment suffix is used in its first sense ("an action, process, or skill").


Boolean expressions with unused variables?

I'm auditing a knowledge base, and have noticed some Boolean expressions like:
A and (A or B)
This Boolean expression is notable because, of course, the value of B is actually irrelevant. This may point to an error in the knowledge base, since the creator of the expression presumably intended for it to consider all variables it contains.
I have two questions:
Is there a name for this phenomenon?
Are there any efficient algorithms for identifying these unused/redundant variables? I can do it using truth tables, but this breaks down for long expressions. I've also looked at minimization algorithms, but many of them don't guarantee to find an optimal solution, so I'm not sure they're guaranteed to identify any unused variables.
You are looking at an instance of the absorption law: A and (A or B) = A.
Those are known as logical equivalences, and lists of well-known, proven equivalences exist. Algorithms may be similar to those used by programming language interpreters or compilers, and they are usually based on trees. You can use a tree to determine the type of expression you are looking at and then match it to its corresponding logical equivalence. It's important that you also read about the precedence of your operators.
Links
Logical equivalences
Order of precedence
Syntax trees
Parse trees
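If truth tables are acceptable for small expressions, the check can at least be automated. Here is a brute-force Python sketch (the function and variable names are my own): a variable is redundant exactly when flipping it never changes the value of the expression.

from itertools import product

def redundant_variables(expr, n):
    # expr is a function of n booleans; exponential in n, so small inputs only.
    redundant = []
    for i in range(n):
        # Variable i is redundant if flipping it never changes the result.
        if all(expr(*a) == expr(*(a[:i] + (not a[i],) + a[i+1:]))
               for a in product([False, True], repeat=n)):
            redundant.append(i)
    return redundant

# A and (A or B): absorption makes B (index 1) redundant.
print(redundant_variables(lambda a, b: a and (a or b), 2))   # prints [1]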

Does any programming language support defining constraints on primitive data types?

Last night I was thinking that programming languages could have a feature that lets us define constraints on the values assigned to primitive data types.
For example, I should be able to say that my variable of type int can only hold values between 0 and 100:
int<0, 100> progress;
This would then act as a normal integer in all scenarios, except that you won't be able to assign values outside the range defined in the constraint. The compiler will not compile the code progress = 200.
This constraint can be carried over with type information.
Is this possible? Is it done in any programming language? If so, which language has it, and what is this technique called?
It is generally not possible. It makes little sense to use integers without any arithmetic operators. With arithmetic operators you have this:
int<0,100> x, u, v;
...
x = u + v; // is it in range?
If you're willing to do checks at run-time, then yes, several mainstream languages support it, starting with Pascal.
I believe Pascal (and Delphi) offers something similar with subrange types.
I think this is not possible at all in Java and in Ruby (well, in Ruby probably it is possible, but requires some effort). I have no idea about other languages, though.
Ada allows something like what you describe with ranges:
type My_Int is range 1..100;
So if you try to assign a value to a My_Int that's less than 1 or greater than 100, Ada will raise the exception Constraint_Error.
Note that I've never used Ada. I've only read about this feature, so do your research before you plunge in.
It is certainly possible. There are many different techniques to do that, but 'dependent types' is the most popular.
The constraints can even be checked statically at compile time by the compiler. See, for example, Agda2 and ATS (ats-lang.org).
Weaker forms of your 'range types' are possible without full dependent types, I think.
Some keywords to search for research papers:
- Guarded types
- Refinement types
- Subrange types
Certainly! In case you missed it: C. Do you C? You don't C? You don't count short as a constraint on Integer? Ok, so C only gives you pre-packaged constrained types.
BTW: It seems the answer that Pascal has subrange types misses the point of them. In Pascal, array bounds violations are not possible. This is because the array index must be of the same type as the array was declared with. In turn this means that to use an integer index you must coerce it down to the subrange, and that is where the run-time check is done, not at the array access.
This is a very important idea because it means a for loop over an array index type may access the array components safely without any run time checking.
Pascal has subranges. Ada extended that a bit, so you can do something like a subrange, or you can create an entirely new type with characteristics of the existing type, but not compatible with it (e.g., even if it was in the right range, you wouldn't be able to assign an Integer to your new type based off of Integer).
C++ doesn't support the idea directly, but is flexible enough that you can implement it if you want to. If you decide to support all the compound assignment operators (+=, -=, *=, etc.) this can be a lot of work though.
Other languages that support operator overloading (e.g., ML and company) can probably support it in much the same way as C++.
Also note that there are a few non-trivial decisions involved in the design. In particular, if the type is used in a way that could/does result in an intermediate result that overflows the specified range, but produces a final result that's within the specified range, what do you want to happen? Depending on your situation, that might be an error, or it might be entirely acceptable, and you'll have to decide which.
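In a language with operator overloading you would wrap the arithmetic operators as well; the deliberately minimal Python sketch below (class name and bounds are my own) only checks at construction time, which is exactly what exposes the design questions above:

class Bounded(int):
    # Runtime-checked subrange: values must lie in [LO, HI].
    LO, HI = 0, 100

    def __new__(cls, value):
        if not cls.LO <= value <= cls.HI:
            raise ValueError("%d outside [%d, %d]" % (value, cls.LO, cls.HI))
        return super().__new__(cls, value)

progress = Bounded(50)              # fine
total = Bounded(60) + Bounded(70)   # plain int 130: arithmetic escapes the check
progress = Bounded(200)             # raises ValueError, like Ada's Constraint_Error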
I really doubt that you can do that. After all, these are primitive datatypes, with emphasis on primitive!
Adding a constraint will make the type a subclass of its primitive state, thus extending it.
From Wikipedia:
a basic type is a data type provided by a programming language as a basic building block. Most languages allow more complicated composite types to be recursively constructed starting from basic types.
a built-in type is a data type for which the programming language provides built-in support.
So personally, even if it is possible, I wouldn't do it, since it's bad practice. Instead, just create an object that wraps this type and enforces the constraints (a solution which I am sure you already thought of).
SQL has domains, which consist of a base type together with a dynamically-checked constraint. For example, you might define the domain telephone_number as a character string with an appropriate number of digits, valid area code, etc.

Why would a language NOT use Short-circuit evaluation?

Why would a language NOT use Short-circuit evaluation? Are there any benefits of not using it?
I see that it could lead to some performance issues... is that true? Why?
Related question: Benefits of using short-circuit evaluation
Reasons NOT to use short-circuit evaluation:
Because it will behave differently and produce different results if your functions, property Gets, or operator methods have side-effects. And this may conflict with: A) Language Standards, B) previous versions of your language, or C) the default assumptions of your language's typical users. These are the reasons that VB has for not short-circuiting.
Because you may want the compiler to have the freedom to reorder and prune expressions, operators and sub-expressions as it sees fit, rather than in the order that the user typed them in. These are the reasons that SQL has for not short-circuiting (or at least not in the way that most developers coming to SQL think it would). Thus SQL (and some other languages) may short-circuit, but only if it decides to and not necessarily in the order that you implicitly specified.
I am assuming here that you are asking about "automatic, implicit order-specific short-circuiting", which is what most developers expect from C, C++, C#, Java, etc. Both VB and SQL have ways to explicitly force order-specific short-circuiting. However, usually when people ask this question it's a "Do What I Meant" question; that is, they mean "why doesn't it Do What I Want?", as in, automatically short-circuit in the order that I wrote it.
One benefit I can think of is that some operations might have side-effects that you might expect to happen.
Example:
if (true || someBooleanFunctionWithSideEffect()) {
    ...
}
But that's typically frowned upon.
Ada does not do it by default. In order to force short-circuit evaluation, you have to use "and then" or "or else" instead of "and" or "or".
The issue is that there are some circumstances where it actually slows things down. If the second condition is quick to calculate and the first condition is almost always true for "and" or false for "or", then the extra check-branch instruction is kind of a waste. However, I understand that with modern processors with branch predictors, this isn't so much the case. Another issue is that the compiler may happen to know that the second half is cheaper or likely to fail, and may want to reorder the check accordingly (which it couldn't do if short-circuit behavior is defined).
I've heard objections that it can lead to unexpected behavior of the code in the case where the second test has side effects. IMHO it is only "unexpected" if you don't know your language very well, but some will argue this.
In case you are interested in what actual language designers have to say about this issue, here's an excerpt from the Ada 83 (original language) Rationale:
The operands of a boolean expression such as A and B can be evaluated in any order. Depending on the complexity of the term B, it may be more efficient (on some but not all machines) to evaluate B only when the term A has the value TRUE. This however is an optimization decision taken by the compiler and it would be incorrect to assume that this optimization is always done. In other situations we may want to express a conjunction of conditions where each condition should be evaluated (has meaning) only if the previous condition is satisfied. Both of these things may be done with short-circuit control forms ...

In Algol 60 one can achieve the effect of short-circuit evaluation only by use of conditional expressions, since complete evaluation is performed otherwise. This often leads to constructs that are tedious to follow...

Several languages do not define how boolean conditions are to be evaluated. As a consequence programs based on short-circuit evaluation will not be portable. This clearly illustrates the need to separate boolean operators from short-circuit control forms.
Look at my example at On SQL Server boolean operator short-circuit, which shows why a certain access path in SQL is more efficient if boolean short-circuit is not used. My blog example shows how relying on boolean short-circuit can break your code if you assume it in SQL, but if you read the reasoning for why SQL evaluates the right-hand side first, you'll see that it is correct and that it results in a much improved access path.
Bill has alluded to a valid reason not to use short-circuiting, but to spell it out in more detail: highly parallel architectures sometimes have problems with branching control paths.
Take NVIDIA’s CUDA architecture for example. The graphics chips use an SIMT architecture which means that the same code is executed on many parallel threads. However, this only works if all threads take the same conditional branch every time. If different threads take different code paths, evaluation is serialized – which means that the advantage of parallelization is lost, because some of the threads have to wait while others execute the alternative code branch.
Short-circuiting actually involves branching the code so short-circuit operations may be harmful on SIMT architectures like CUDA.
– But like Bill said, that’s a hardware consideration. As far as languages go, I’d answer your question with a resounding no: preventing short-circuiting does not make sense.
I'd say 99 times out of 100 I would prefer the short-circuiting operators for performance.
But there are two big reasons I've found where I won't use them.
(By the way, my examples are in C where && and || are short-circuiting and & and | are not.)
1.) When you want to call two or more functions in an if statement regardless of the value returned by the first.
if (isABC() || isXYZ()) // short-circuiting logical operator
    //do stuff;
In that case isXYZ() is only called if isABC() returns false. But you may want isXYZ() to be called no matter what.
So instead you do this:
if (isABC() | isXYZ()) // non-short-circuiting bitwise operator
    //do stuff;
2.) When you're performing boolean math with integers.
myNumber = i && 8; // short-circuiting logical operator
is not necessarily the same as:
myNumber = i & 8; // non-short-circuiting bitwise operator
In this situation you can actually get different results because the short-circuiting operator won't necessarily evaluate the entire expression. And that makes it basically useless for boolean math. So in this case I'd use the non-short-circuiting (bitwise) operators instead.
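The same trap exists outside C. A quick Python analog of the point (values chosen to show the divergence):

i = 3                # binary 0011
print(i and 8)       # 8 -> logical: i is truthy, so the result is the second operand
print(i & 8)         # 0 -> bitwise: 0011 & 1000 share no bits

i = 12               # binary 1100
print(i and 8)       # 8 -> logical result is unchanged
print(i & 8)         # 8 -> bitwise result happens to agree here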
Like I was hinting at, these two scenarios really are rare for me. But you can see there are real programming reasons for both types of operators. And luckily most of the popular languages today have both. Even VB.NET has the AndAlso and OrElse short-circuiting operators. If a language today doesn't have both I'd say it's behind the times and really limits the programmer.
If you wanted the right hand side to be evaluated:
if( x < 13 | ++y > 10 )
    printf("do something\n");
Perhaps you wanted y to be incremented whether or not x < 13. A good argument against doing this, however, is that creating conditions without side effects is usually better programming practice.
As a stretch:
If you wanted a language to be super secure (at the cost of awesomeness), you would remove short circuit eval. When something 'secure' takes a variable amount of time to happen, a Timing Attack could be used to mess with it. Short circuit eval results in things taking different times to execute, hence poking the hole for the attack. In this case, not even allowing short circuit eval would hopefully help write more secure algorithms (wrt timing attacks anyway).
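To make the timing-attack point concrete, here is a hedged Python sketch; the early-exit comparison leaks how many leading bytes matched, while hmac.compare_digest (a real standard-library function) examines every byte regardless:

import hmac

def leaky_equals(a, b):
    # Short-circuits on the first mismatch: runtime depends on the secret.
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True

def safe_equals(a, b):
    # Constant-time comparison: inspects every byte no matter what.
    return hmac.compare_digest(a, b)

print(leaky_equals(b"secret", b"sXcret"))   # False, but it fails fast
print(safe_equals(b"secret", b"sXcret"))    # False, in time independent of the match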
The Ada programming language supported both boolean operators that did not short circuit (AND, OR), to allow a compiler to optimize and possibly parallelize the constructs, and operators with explicit request for short circuit (AND THEN, OR ELSE) when that's what the programmer desires. The downside to such a dual-pronged approach is to make the language a bit more complex (1000 design decisions taken in the same "let's do both!" vein will make a programming language a LOT more complex overall;-).
Not that I think this is what's going on in any language now, but it would be rather interesting to feed both sides of an operation to different threads. Most operands could be pre-determined not to interfere with each other, so they would be good candidates for handing off to different CPUs.
This kind of thing matters on highly parallel CPUs that tend to evaluate multiple branches and choose one.
Hey, it's a bit of a stretch but you asked "Why would a language"... not "Why does a language".
The language Lustre does not use short-circuit evaluation. In if-then-elses, both then and else branches are evaluated at each tick, and one is considered the result of the conditional depending on the evaluation of the condition.
The reason is that this language, and other synchronous dataflow languages, have a concise syntax to speak of the past. Each branch needs to be computed so that the past of each is available if it becomes necessary in future cycles. The language is supposed to be functional, so that wouldn't matter, but you may call C functions from it (and perhaps notice they are called more often than you thought).
In Lustre, writing the equivalent of
if (y <> 0) then 100/y else 100
is a typical beginner mistake. The division by zero is not avoided, because the expression 100/y is evaluated even on cycles when y=0.
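You can mimic the Lustre behaviour in Python by evaluating both branches before selecting one (a contrived sketch, just to show the failure mode):

y = 0

# Short-circuiting conditional: the division is never evaluated when y == 0.
safe = 100 / y if y != 0 else 100

# Eager evaluation of both branches, as in Lustre: both expressions are
# computed before one is selected, so this line raises ZeroDivisionError.
branches = (100 / y, 100)
chosen = branches[0] if y != 0 else branches[1]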
Because short-circuiting can change the behavior of an application, e.g.:
if(!SomeMethodThatChangesState() || !SomeOtherMethodThatChangesState())
I'd say it's valid for readability issues; if someone takes advantage of short circuit evaluation in a not fully obvious way, it can be hard for a maintainer to look at the same code and understand the logic.
If memory serves, Erlang provides two sets of constructs: standard and/or, and then andalso/orelse. These clarify intent, as in "yes, I know this is short-circuiting, and you should too", whereas at other points the intent needs to be derived from the code.
As an example, say a maintainer comes across these lines:
if(user.inDatabase() || user.insertInDatabase())
    user.DoCoolStuff();
It takes a few seconds to recognize that the intent is "if the user isn't in the Database, insert him/her/it; if that works do cool stuff".
As others have pointed out, this is really only relevant when doing things with side effects.
I don't know about any performance issues, but one possible argumentation to avoid it (or at least excessive use of it) is that it may confuse other developers.
There are already great responses about the side-effect issue, but I didn't see anything about the performance aspect of the question.
If you do not allow short-circuit evaluation, the performance issue is that both sides must be evaluated even though it will not change the outcome. This is usually a non-issue, but may become relevant under one of these two circumstances:
The code is in an inner loop that is called very frequently
There is a high cost associated with evaluating the expressions (perhaps IO or an expensive computation)
The short-circuit evaluation automatically provides conditional evaluation of a part of the expression.
The main advantage is that it simplifies the expression.
The performance could be improved but you could also observe a penalty for very simple expressions.
Another consequence is that side effects of the evaluation of the expression could be affected.
In general, relying on side-effects is not good practice, but in some specific contexts, it could be the preferred solution.
VB6 doesn't use short-circuit evaluation, I don't know if newer versions do, but I doubt it. I believe this is just because older versions didn't either, and because most of the people who used VB6 wouldn't expect that to happen, and it would lead to confusion.
This is just one of the things that made it extremely hard for me to get out of being a noob VB programmer who wrote spaghetti code, and get on with my journey to be a real programmer.
Many answers have talked about side-effects. Here's a Python example without side-effects in which (in my opinion) short-circuiting improves readability.
for i in range(len(myarray)):
    if myarray[i] > 5 or (i > 0 and myarray[i-1] > 5):
        print "At index", i, "either arr[i] or arr[i-1] is big"
The short-circuit ensures we never evaluate myarray[-1] when i is 0; in Python a negative index doesn't raise an exception but silently wraps around to the last element, so the unguarded check could quietly give a wrong answer. The code could of course be written without short-circuits, e.g.
for i in range(len(myarray)):
    big = myarray[i] > 5
    if not big:
        if i > 0:
            big = myarray[i-1] > 5
    if big:
        print "At index", i, ...
but I think the short-circuit version is more readable.

What is the purpose of null?

I am in a compilers class and we are tasked with creating our own language, from scratch. Currently our dilemma is whether to include a 'null' type or not. What purpose does null provide? Some of our team is arguing that it is not strictly necessary, while others are pro-null just for the extra flexibility it can provide.
Do you have any thoughts, especially for or against null?
Have you ever created functionality that required null?
Null: The Billion Dollar Mistake. Tony Hoare:
I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years. In recent years, a number of program analysers like PREfix and PREfast in Microsoft have been used to check references, and give warnings if there is a risk they may be non-null. More recent programming languages like Spec# have introduced declarations for non-null references. This is the solution, which I rejected in 1965.
null is a sentinel value that is not an integer, not a string, not a boolean - not anything really, except something to hold and be a "not there" value. Don't treat it as or expect it to be a 0, or an empty string or an empty list. Those are all valid values and can be genuinely valid values in many circumstances - the idea of a null instead means there is no value there.
Perhaps it's a little bit like a function throwing an exception instead of returning a value. Except instead of manufacturing and returning an ordinary value with a special meaning, it returns a special value that already has a special meaning. If a language expects you to work with null, then you can't really ignore it.
Oh no, I feel the philosophy major coming out of me....
The notion of NULL comes from the notion of the empty set in set theory. Nearly everyone agrees that the empty set is not equal to zero. Mathematicians and philosophers have been battling about the value of set theory for decades.
In programming languages, I think it is very helpful to understand object references that do not refer to anything in memory. Google about set theory and you will see similarities between the formal symbolic systems (notation) that set theorists use and symbols we use in many computer languages.
What's null for, you ask?
Well,
Nothing.
I usually think of 'null' in the C/C++ sense of 'memory address 0'. It's not strictly needed, but if it didn't exist, then people would just use something else (if myNumber == -1, or if myString == "").
All I know is, I can't think of a day I've spent coding that I haven't typed the word "null", so I think that makes it pretty important.
In the .NET world, MS recently added nullable types for int, long, etc. that never used to be nullable, so I guess they think it's pretty important too.
If I was designing a language, I would keep it. However, I wouldn't avoid using a language that didn't have null either. It would just take a little getting used to.
The concept of null is not strictly necessary in exactly the same sense that the concept of zero is not strictly necessary.
I don't think it's helpful to talk about null outside the context of the whole language design. First point of confusion: is the null type empty, or does it include a single, distinguished value (often called "nil")? A completely empty type is not very useful---although C uses the empty return type void to mark a procedure that is executed only for side effect, many other languages use a singleton type (usually the empty tuple) for this purpose.
I find that a nil value is used most effectively in dynamically typed languages. In Smalltalk it is the value used when you need a value but you don't have any information. In Lua it is used even more effectively: the nil value is the only value that cannot be a key or a value in a Lua table. In Lua, nil is also used as the value of missing parameters or results.
Overall I would say that a nil value can be useful in a dynamically typed setting, but in a statically typed setting, a null type is useful only for talking about functions (or procedures or methods) that are executed for side effect.
At all costs, avoid the NULL pointer used in C and Java. These are artifacts inherent in the implementations of pointers and objects, and in a well designed language they should not be allowed. By all means give your users a way to extend an existing type with a null value, but make them do it explicitly, on purpose---don't force every type to have one by accident. (As an example of explicit use, I recently implemented Bentley and Sedgewick's ternary search trees in Haskell, and I needed to extend the character type with one additional value meaning 'not a character'. For this purpose Haskell provides the Maybe type.)
Finally, if you are writing a compiler, it is good to remember that the easiest parts of the language to compile, and the parts that cause the fewest bugs, are the parts that aren't there :-)
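As a sketch of that "extend a type with null explicitly, on purpose" idea, here is how it reads with Python's Optional annotation (the function name is hypothetical):

from typing import Optional

def find_index(xs, target) -> Optional[int]:
    # The signature says "int or nothing"; ints elsewhere stay non-nullable.
    for i, x in enumerate(xs):
        if x == target:
            return i
    return None

hit = find_index([3, 1, 4], 1)
if hit is None:
    print("not found")
else:
    print("found at", hit)   # found at 1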
It seems useful to have a way to indicate a reference or pointer that isn't currently pointing at anything, whether you call it null, nil, None, etc. If for no other reason to let people know when they're about to fall off the end of a linked list.
In C, NULL was ((void *)0), so it was a value with a type. But that didn't work with C++ templates, so C++ made NULL 0: it dropped the type and became a pure value.
However, it was found that having a specific NULL type would be better, so they (the C++ committee) decided that NULL will once again become a type (nullptr, in C++0x).
Also, almost every language besides C++ has NULL as a type, or an equivalent unique value not the same as 0 (it might compare equal to it or not, but it's not the same value).
So now even C++ will use NULL as a type, basically closing the discussion on the matter, since now (almost) everyone will have a NULL type.
Edit: Thinking about it, Haskell's Maybe is another solution to NULL types, but it's not as easy to grasp or implement.
A practical example of null is when you ask a yes/no question and don't get a response. You don't want to default to no because it might be important to know that the question wasn't answered in situations where the answer is very important.
Null is not a mistake.
Null means "I don't know yet"
For primitives you don't really need a null (I have to say that strings (in .NET) shouldn't get it IMHO)
But for composite entities it definitely serves a purpose.
You can think of any type as a set along with a collection of operations. There are many cases where it's convenient to have a value which isn't a "normal" value; for example, consider an "EOF" value for C's getline(). You can handle that in one of several ways: you can have a NULL value outside the set, you can distinguish a particular value as null (in C, ((void *)0) can serve that purpose), or you can have a way of creating a new type, so that for type T, you create a type T' =def { T ∪ NULL }, which is the way Haskell does it (a "Maybe" type).
Which one is better is good for lots of enjoyable argument.
Null is only useful in situations where there are variables with unassigned values. If every variable has a value, then there is no need for null values.
Null is a sentinel value. It's a value that cannot possibly be real data and instead provides meta-data about the variable in use.
Null assigned to a pointer indicates that the pointer is uninitialized. This gives you the ability to detect misuse of uninitialized pointers by detecting dereferences of null valued pointers. If you instead leave the value of a pointer equal to whatever happened to be in memory then you would have crazily irregular program behavior that would be much more difficult to debug.
Also, the null character in a C-style variable length string is used to mark the end of the string.
The use of null in these ways, especially for pointer values, has become so popular that the metaphor has been imported into other systems, even when the "null" sentinel value is implemented entirely differently and has no relation to the number 0.
Null is not the problem - everyone treating, and interpreting null differently is the problem.
I like null. If there was no null, null would only be replaced with some other way for the code to say "I have no clue, dude!" (which some would write "I have no clue, man!", or "I have not a clue, old bean!" etc. and so, we'd have the exact same problems again).
I generalize, I know.
Consider the examples of C and of Java, for example. In C, the convention is that a null pointer is the numeric value zero. Of course, that's really just a convention: nothing about the language treats that value as anything special. In Java, however, null is a distinct concept that you can detect and know that, yes, this is in fact a bad reference and I shouldn't try to open that door to see what's on the other side.
Even so, I hate nulls almost worse than anything else.
CLARIFICATION based on comments: I hate the de facto null pointer value of zero worse than I hate null.
Any time I see an assignment to null, I think, "oh good, someone has just put a landmine in the code. Someday, we're going to be walking down a related execution path and BOOM! NullPointerException!"
What I would prefer is for someone to specify a useful default or NullObject that lets me know that "this parameter has not been set to anything useful." A bald null by itself is just trouble waiting to happen.
That said, it's still better than a raw zero wandering around loose.
That decision depends on the objective of the programing language.
Who are you designing the programing language for? Are you designing it for people who are familiar with c-derived languages? If so, then you should probably add support for null.
In general, I would say that you should avoid violating people's expectations unless it serves a particular purpose.
Take switch-blocks in C# as an example. Every case block in C# must end with an explicit control-flow statement. That is, each branch must end with either a "break" statement or an explicit goto. That means that while this code is legal:
switch(x)
{
    case 1:
    case 2:
        foo;
        break;
}
this code would not be legal:
switch (x)
{
    case 1:
        foo();
    case 2:
        bar();
        break;
}
In order to create a "fall through" from case 1 to case 2, it's necessary to insert a goto, like this:
switch (x)
{
    case 1:
        foo();
        goto case 2;
    case 2:
        bar();
        break;
}
This is arguably something that would violate the expectations of C++ programmers who are learning C#. However, adding that restriction serves a purpose. It eliminates the possibility of an entire class of common C++ bugs. It adds to the learning curve of the language slightly, but the result is a net benefit to the programmer.
If your goal is to design a language targeted at C++ programmers, then removing null would probably violate their expectations. That will cause confusion, and make your language more difficult to learn. The key question is then, "what benefit do they get"? Or, alternatively, "what detriment does this cause".
If you are simply trying to design a "super small language" that can be implemented in the course of a single semester, then the story is different. In that case your objective isn't to build a useful language targeted at a particular segment of the population. Instead, it's just to learn how to create a compiler. In that scenario, having a smaller language is a big benefit, and so it's worth eliminating null.
So, to recap, I would say that you should:
Identify your goals in creating the language. Who is the language designed for, and what are their needs?
Make the decision based on what helps the target users meet their goals in the best way.
Usually this will make the desired result pretty clear.
Of course, if you don't explicitly articulate your design goals, or you can't agree on what they are, then you are still going to argue. In that case, however, you are pretty much doomed anyways.
One other way to look at null is that it's a performance issue. If you have a complex object containing other complex objects and so on, then it is more efficient to allow all properties to initially be null instead of creating some kind of empty objects that aren't good for anything and will soon be replaced.
That's just one perspective that I can't see mentioned before.
What purpose does null provide?
I believe there are two concepts of null at work here.
The first (null the logical indicator) is a conventional program language mechanism that provides runtime indication of a non-initialized memory reference in program logic.
The second (null the value) is a base data value that can be used in logical expressions to detect the logical null indicator (the previous definition) and make logical decisions in program code.
Do you have any thoughts, especially for or against null?
While null has been the bane of many programmers and the source of many application faults over the years, the null concept has validity. If you and your team create a language that uses memory references that can be potentially misused because the reference was not initialized, you will likely need a mechanism to detect that eventuality. It is always an option to create an alternative, but null is a widely known alternative.
Bottom line, it all depends upon the goals of your language:
target programming audience
robustness
performance
etc...
If robustness and program correctness are high on your priority list AND you allow programmatic memory references, you will want to consider null.
If you are creating a statically typed language, I imagine that null could add a good deal of complexity to your compiler.
If you are creating a dynamically typed language, NULL can come in quite handy, as it is just another "type" without any variations.
Null is a placeholder that means that no value (append "of the correct type" for a statically typed language) has been assigned to that variable.
There is cognitive dissonance here. I heard somewhere else that humans cannot comprehend negation, because they must suppose a value and then imagine its unfitness.
My suggestion to your team is: come up with some examples programs that need to be written in your language, and see how they would look if you left out null, versus if you included it.
Use a null object pattern!
If your language is object oriented, let it have an UndefinedValue class of which only one singleton instance exists. Then use this instance wherever null would be used. This has the advantage that your null will respond to messages such as #toString and #equals. You will never run into a null pointer exception as in Java. (Of course, this requires that your language is dynamically typed.)
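A rough Python rendering of the suggestion (class and method names are illustrative):

class UndefinedValue:
    # Singleton null object: it answers messages instead of blowing up.
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __str__(self):
        return "<undefined>"

    def __eq__(self, other):
        return other is self

undefined = UndefinedValue()
print(str(undefined))                  # "<undefined>" instead of an exception
print(undefined == UndefinedValue())   # True: there is only one instance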
Null provides an easy way out for programmers who haven't completely thought through the logic and domains needed by their program, or the future maintenance implications of using a value with essentially no clear and agreed upon definition.
It may seem obvious at first that it must mean "no value", but what that ACTUALLY means depends on context. If, for instance, LastName === null, does that mean that the person doesn't have a last name, or that we don't know what their last name is, or that it hasn't been entered into the system yet? Does null equal itself, or doesn't it? In SQL it does not. In many languages it does. But if we don't know the value of personA.lastName or personB.lastName, how can we know that personA.lastName === personB.lastName, eh? Should the result be false, or ... null?
It depends on what you're doing, which is why it's dangerous and silly to have some kind of system-wide value that can be used for any situation that looks vaguely like "nothing": other parts of your program and external libraries or modules can't really be depended upon to correctly interpret what you meant by "null".
You're much better off clearly defining the DOMAIN of possible values of lastName, and exactly what every possible value actually means, rather than depending on some vague systemwide notion of null, which may or may not have any relevance to what you're doing, depending on which language you're using, and what you're trying to do. A value, which may in fact, behave in exactly the wrong way when you begin to operate on your data.
Null is to objects what 0 is to numbers.

Functional, Declarative, and Imperative Programming [closed]

What do the terms functional, declarative, and imperative programming mean?
At the time of writing this, the top voted answers on this page are imprecise and muddled on the declarative vs. imperative definition, including the answer that quotes Wikipedia. Some answers are conflating the terms in different ways.
Refer also to my explanation of why spreadsheet programming is declarative, despite the fact that the formulas mutate the cells.
Also, several answers claim that functional programming must be a subset of declarative. On that point it depends on whether we differentiate "function" from "procedure". Let's handle imperative vs. declarative first.
Definition of declarative expression
The only attribute that can possibly differentiate a declarative expression from an imperative expression is the referential transparency (RT) of its sub-expressions. All other attributes are either shared between both types of expressions, or derived from the RT.
A 100% declarative language (i.e. one in which every possible expression is RT) does not (among other RT requirements) allow the mutation of stored values, e.g. HTML and most of Haskell.
Definition of RT expression
RT is often referred to as having "no side-effects". The term effects does not have a precise definition, so some people don't agree that "no side-effects" is the same as RT. RT has a precise definition:
An expression e is referentially transparent if for all programs p every occurrence of e in p can be replaced with the result of evaluating e, without affecting the observable result of p.
Since every sub-expression is conceptually a function call, RT requires that the implementation of a function (i.e. the expression(s) inside the called function) may not access the mutable state that is external to the function (accessing the mutable local state is allowed). Put simply, the function (implementation) should be pure.
Definition of pure function
A pure function is often said to have "no side-effects". The term effects does not have a precise definition, so some people don't agree.
Pure functions have the following attributes.
the only observable output is the return value.
the only output dependency is the arguments.
arguments are fully determined before any output is generated.
Remember that RT applies to expressions (which includes function calls) and purity applies to (implementations of) functions.
An obscure example of impure functions that make RT expressions is concurrency, but this is because the purity is broken at the interrupt abstraction layer. You don't really need to know this. To make RT expressions, you call pure functions.
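A minimal Python sketch of the distinction (the names are mine):

# Pure: output depends only on the arguments, so a call such as
# area(3, 4) can be replaced by its value 12 anywhere -- that is RT.
def area(w, h):
    return w * h

# Impure: reads mutable state external to the function, so the
# expression scaled(3) is not replaceable by a single value.
factor = 2
def scaled(x):
    return x * factor

print(area(3, 4))    # 12, always
print(scaled(3))     # 6 now...
factor = 10
print(scaled(3))     # ...but 30 later: same expression, different result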
Derivative attributes of RT
Any other attribute cited for declarative programming, e.g. the citation from 1999 used by Wikipedia, either derives from RT, or is shared with imperative programming. Thus proving that my precise definition is correct.
Note, immutability of external values is a subset of the requirements for RT.
Declarative languages don't have looping control structures, e.g. for and while, because due to immutability, the loop condition would never change.
Declarative languages don't express control-flow other than nested function order (a.k.a logical dependencies), because due to immutability, other choices of evaluation order do not change the result (see below).
Declarative languages express logical "steps" (i.e. the nested RT function call order), but whether each function call is a higher level semantic (i.e. "what to do") is not a requirement of declarative programming. The distinction from imperative is that due to immutability (i.e. more generally RT), these "steps" cannot depend on mutable state, rather only the relational order of the expressed logic (i.e. the order of nesting of the function calls, a.k.a. sub-expressions).
For example, the HTML paragraph <p> cannot be displayed until the sub-expressions (i.e. tags) in the paragraph have been evaluated. There is no mutable state, only an order dependency due to the logical relationship of tag hierarchy (nesting of sub-expressions, which are analogously nested function calls).
Thus there is the derivative attribute of immutability (more generally RT), that declarative expressions, express only the logical relationships of the constituent parts (i.e. of the sub-expression function arguments) and not mutable state relationships.
Evaluation order
The choice of evaluation order of sub-expressions can only give a varying result when any of the function calls are not RT (i.e. the function is not pure), e.g. some mutable state external to a function is accessed within the function.
For example, given some nested expressions, e.g. f( g(a, b), h(c, d) ), eager and lazy evaluation of the function arguments will give the same results if the functions f, g, and h are pure.
Whereas, if the functions f, g, and h are not pure, then the choice of evaluation order can give a different result.
Note, nested expressions are conceptually nested functions, since expression operators are just function calls masquerading as unary prefix, unary postfix, or binary infix notation.
Tangentially, if all identifiers, e.g. a, b, c, d, are immutable everywhere, state external to the program cannot be accessed (i.e. I/O), and there is no abstraction layer breakage, then functions are always pure.
By the way, Haskell has a different syntax, f (g a b) (h c d).
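A small Python demonstration that impure functions make evaluation order observable (the functions are contrived; a shared mutable list stands in for external state):

calls = []

def g():
    calls.append("g")
    return len(calls)          # depends on mutable state external to g

def f(x):
    calls.append("f")
    return x * 10

eager = f(g())                 # g runs first: it sees one call, so f(1) == 10

calls.clear()

def f_lazy(thunk):
    calls.append("f")
    return thunk() * 10        # the argument is forced after f's own effect

lazy = f_lazy(lambda: g())     # g now sees two calls, so the result is 20
print(eager, lazy)             # 10 20: order of evaluation changed the answer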
Evaluation order details
A function is a state transition (not a mutable stored value) from the input to the output. For RT compositions of calls to pure functions, the order-of-execution of these state transitions is independent. The state transition of each function call is independent of the others, due to lack of side-effects and the principle that an RT function may be replaced by its cached value. To correct a popular misconception, pure monadic composition is always declarative and RT, in spite of the fact that Haskell's IO monad is arguably impure and thus imperative w.r.t. the World state external to the program (but in the sense of the caveat below, the side-effects are isolated).
Eager evaluation means the functions arguments are evaluated before the function is called, and lazy evaluation means the arguments are not evaluated until (and if) they are accessed within the function.
Definition: function parameters are declared at the function definition site, and function arguments are supplied at the function call site. Know the difference between parameter and argument.
Conceptually, all expressions are (a composition of) function calls, e.g. constants are functions without inputs, unary operators are functions with one input, binary infix operators are functions with two inputs, constructors are functions, and even control statements (e.g. if, for, while) can be modeled with functions. The order that these argument functions (do not confuse with nested function call order) are evaluated is not declared by the syntax, e.g. f( g() ) could eagerly evaluate g then f on g's result or it could evaluate f and only lazily evaluate g when its result is needed within f.
Caveat, no Turing complete language (i.e. that allows unbounded recursion) is perfectly declarative, e.g. lazy evaluation introduces memory and time indeterminism. But these side-effects due to the choice of evaluation order are limited to memory consumption, execution time, latency, non-termination, and external hysteresis thus external synchronization.
Functional programming
Because declarative programming cannot have loops, the only way to iterate is functional recursion. It is in this sense that functional programming is related to declarative programming.
But functional programming is not limited to declarative programming. Functional composition can be contrasted with subtyping, especially with respect to the Expression Problem, where extension can be achieved by either adding subtypes or functional decomposition. Extension can be a mix of both methodologies.
Functional programming usually makes the function a first-class object, meaning the function type can appear in the grammar anywhere any other type may. The upshot is that functions can input and operate on functions, thus providing for separation-of-concerns by emphasizing function composition, i.e. separating the dependencies among the subcomputations of a deterministic computation.
For example, instead of writing a separate function (and employing recursion instead of loops if the function must also be declarative) for each of an infinite number of possible specialized actions that could be applied to each element of a collection, functional programming employs reusable iteration functions, e.g. map, fold, filter. These iteration functions input a first-class specialized action function. These iteration functions iterate the collection and call the input specialized action function for each element. These action functions are more concise because they no longer need to contain the looping statements to iterate the collection.
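For instance, in Python (an illustrative sketch):

from functools import reduce

words = ["state", "statement", "expression"]

# The iteration functions own the looping; the actions are first-class values.
lengths = list(map(len, words))                        # [5, 9, 10]
long_ones = list(filter(lambda w: len(w) > 5, words))  # ['statement', 'expression']
total = reduce(lambda acc, w: acc + len(w), words, 0)  # a fold: 24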
However, note that if a function is not pure, then it is really a procedure. We can perhaps argue that functional programming that uses impure functions, is really procedural programming. Thus if we agree that declarative expressions are RT, then we can say that procedural programming is not declarative programming, and thus we might argue that functional programming is always RT and must be a subset of declarative programming.
Parallelism
This functional composition with first-class functions can express the depth of the parallelism by separating out the independent functions.
Brent’s Principle: computation with work w and depth d can be implemented in a p-processor PRAM in time O(max(w/p, d)).
Both concurrency and parallelism also require declarative programming, i.e. immutability and RT.
So where did this dangerous assumption that Parallelism == Concurrency come from? It’s a natural consequence of languages with side-effects: when your language has side-effects everywhere, then any time you try to do more than one thing at a time you essentially have non-determinism caused by the interleaving of the effects from each operation. So in side-effecty languages, the only way to get parallelism is concurrency; it’s therefore not surprising that we often see the two conflated.
FP evaluation order
Note the evaluation order also impacts the termination and performance side-effects of functional composition.
Eager (CBV) and lazy (CBN) are categorical duals[10], because they have reversed evaluation order, i.e. whether the outer or inner functions respectively are evaluated first. Imagine an upside-down tree, then eager evaluates from function tree branch tips up the branch hierarchy to the top-level function trunk; whereas, lazy evaluates from the trunk down to the branch tips. Eager doesn't have conjunctive products ("and", a/k/a categorical "products") and lazy doesn't have disjunctive coproducts ("or", a/k/a categorical "sums")[11].
Performance
Eager
As with non-termination, eager is too eager with conjunctive functional composition, i.e. compositional control structure does unnecessary work that isn't done with lazy. For example, eager eagerly and unnecessarily maps the entire list to booleans, when it is composed with a fold that terminates on the first true element.
This unnecessary work is the cause of the claimed "up to" an extra log n factor in the sequential time complexity of eager versus lazy, both with pure functions. A solution is to use functors (e.g. lists) with lazy constructors (i.e. eager with optional lazy products), because with eager the eagerness incorrectness originates from the inner function. This is because products are constructive types, i.e. inductive types with an initial algebra on an initial fixpoint[11].
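Python generators give a convenient way to watch the extra work happen (a contrived sketch; expensive() just counts its calls):

count = 0
def expensive(x):
    global count
    count += 1
    return x > 2

data = [1, 2, 3, 4, 5]

count = 0
print(any([expensive(x) for x in data]), count)  # True 5: eager maps the whole list

count = 0
print(any(expensive(x) for x in data), count)    # True 3: lazy stops at the first True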
Lazy
As with non-termination, lazy is too lazy with disjunctive functional composition, i.e. coinductive finality can occur later than necessary, resulting in both unnecessary work and non-determinism of the lateness that isn't the case with eager[10][11]. Examples of finality are state, timing, non-termination, and runtime exceptions. These are imperative side-effects, but even in a pure declarative language (e.g. Haskell), there is state in the imperative IO monad (note: not all monads are imperative!) implicit in space allocation, and timing is state relative to the imperative real world. Using lazy even with optional eager coproducts leaks "laziness" into inner coproducts, because with lazy the laziness incorrectness originates from the outer function (see the example in the Non-termination section, where == is an outer binary operator function). This is because coproducts are bounded by finality, i.e. coinductive types with a final algebra on a final object[11].
Lazy causes indeterminism in the design and debugging of functions for latency and space, the debugging of which is probably beyond the capabilities of the majority of programmers, because of the dissonance between the declared function hierarchy and the runtime order-of-evaluation. Lazy pure functions evaluated with eager, could potentially introduce previously unseen non-termination at runtime. Conversely, eager pure functions evaluated with lazy, could potentially introduce previously unseen space and latency indeterminism at runtime.
Non-termination
At compile-time, due to the Halting problem and mutual recursion in a Turing complete language, functions can't generally be guaranteed to terminate.
Eager
With eager but not lazy, for the conjunction of Head "and" Tail, if either Head or Tail doesn't terminate, then respectively either List( Head(), Tail() ).tail == Tail() or List( Head(), Tail() ).head == Head() is not true because the left-side doesn't, and right-side does, terminate.
Whereas, with lazy both sides terminate. Thus eager is too eager with conjunctive products, and non-terminates (including runtime exceptions) in those cases where it isn't necessary.
Lazy
With lazy but not eager, for the disjunction of 1 "or" 2, if f doesn't terminate, then List( f ? 1 : 2, 3 ).tail == (f ? List( 1, 3 ) : List( 2, 3 )).tail is not true because the left-side terminates, and right-side doesn't.
Whereas, with eager neither side terminates so the equality test is never reached. Thus lazy is too lazy with disjunctive coproducts, and in those cases fails to terminate (including runtime exceptions) after doing more work than eager would have.
[10] Declarative Continuations and Categorical Duality, Filinski, sections 2.5.4 A comparison of CBV and CBN, and 3.6.1 CBV and CBN in the SCL.
[11] Declarative Continuations and Categorical Duality, Filinski, sections 2.2.1 Products and coproducts, 2.2.2 Terminal and initial objects, 2.5.2 CBV with lazy products, and 2.5.3 CBN with eager coproducts.
There's not really any non-ambiguous, objective definition for these. Here is how I would define them:
Imperative - The focus is on what steps the computer should take rather than what the computer will do (ex. C, C++, Java).
Declarative - The focus is on what the computer should do rather than how it should do it (ex. SQL).
Functional - a subset of declarative languages that has a heavy focus on recursion
imperative and declarative describe two opposing styles of programming. imperative is the traditional "step by step recipe" approach while declarative is more "this is what i want, now you work out how to do it".
these two approaches occur throughout programming - even with the same language and the same program. generally the declarative approach is considered preferable, because it frees the programmer from having to specify so many details, while also having less chance for bugs (if you describe the result you want, and some well-tested automatic process can work backwards from that to define the steps then you might hope that things are more reliable than having to specify each step by hand).
on the other hand, an imperative approach gives you more low level control - it's the "micromanager approach" to programming. and that can allow the programmer to exploit knowledge about the problem to give a more efficient answer. so it's not unusual for some parts of a program to be written in a more declarative style, but for the speed-critical parts to be more imperative.
as you might imagine, the language you use to write a program affects how declarative you can be - a language that has built-in "smarts" for working out what to do given a description of the result is going to allow a much more declarative approach than one where the programmer needs to first add that kind of intelligence with imperative code before being able to build a more declarative layer on top. so, for example, a language like prolog is considered very declarative because it has, built-in, a process that searches for answers.
so far, you'll notice that i haven't mentioned functional programming. that's because it's a term whose meaning isn't immediately related to the other two. at its most simple, functional programming means that you use functions. in particular, that you use a language that supports functions as "first class values" - that means that not only can you write functions, but you can write functions that write functions (that write functions that...), and pass functions to functions. in short - that functions are as flexible and common as things like strings and numbers.
it might seem odd, then, that functional, imperative and declarative are often mentioned together. the reason for this is a consequence of taking the idea of functional programming "to the extreme". a function, in its purest sense, is something from maths - a kind of "black box" that takes some input and always gives the same output. and that kind of behaviour doesn't require storing changing variables. so if you design a programming language whose aim is to implement a very pure, mathematically influenced idea of functional programming, you end up rejecting, largely, the idea of values that can change (in a certain, limited, technical sense).
and if you do that - if you limit how variables can change - then almost by accident you end up forcing the programmer to write programs that are more declarative, because a large part of imperative programming is describing how variables change, and you can no longer do that! so it turns out that functional programming - particularly, programming in a functional language - tends to give more declarative code.
to summarise, then:
imperative and declarative are two opposing styles of programming (the same names are used for programming languages that encourage those styles)
functional programming is a style of programming where functions become very important and, as a consequence, changing values become less important. the limited ability to specify changes in values forces a more declarative style.
so "functional programming" is often described as "declarative".
In a nutshell:
An imperative language specifies a series of instructions that the computer executes in sequence (do this, then do that).
A declarative language declares a set of rules about what outputs should result from which inputs (eg. if you have A, then the result is B). An engine will apply these rules to inputs, and give an output.
A functional language declares a set of mathematical/logical functions which define how input is translated to output. eg. f(y) = y * y. it is a type of declarative language.
Imperative: how to achieve our goal
Take the next customer from a list.
If the customer lives in Spain, show their details.
If there are more customers in the list, go to the beginning.
Declarative: what we want to achieve
Show customer details of every customer living in Spain
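The same contrast in code (a Python sketch; the customer records are invented for illustration):

customers = [
    {"name": "Ana",  "country": "Spain"},
    {"name": "Bob",  "country": "UK"},
    {"name": "Luis", "country": "Spain"},
]

# Imperative: take the next customer, test, show, repeat.
i = 0
while i < len(customers):
    if customers[i]["country"] == "Spain":
        print(customers[i])
    i += 1

# Declarative: "show details of every customer living in Spain".
print([c for c in customers if c["country"] == "Spain"])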
Imperative Programming means any style of programming where your program is structured out of instructions describing how the operations performed by a computer will happen.
Declarative Programming means any style of programming where your program is a description either of the problem or the solution - but doesn't explicitly state how the work will be done.
Functional Programming is programming by evaluating functions and functions of functions... Strictly defined, functional programming means programming by defining side-effect-free mathematical functions, so it is a form of declarative programming, but it isn't the only kind of declarative programming.
Logic Programming (for example in Prolog) is another form of declarative programming. It involves computing by deciding whether a logical statement is true (or whether it can be satisfied). The program is typically a series of facts and rules - i.e. a description rather than a series of instructions.
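A toy sketch of that facts-and-rules flavour, in Python rather than Prolog (the fixpoint loop below stands in for the search that Prolog provides built in):

# Facts: ("parent", X, Y) means X is a parent of Y.
facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}

# Rule: X is an ancestor of Y if X is a parent of Y,
# or X is a parent of some Z who is an ancestor of Y.
def ancestors(facts):
    anc = {(x, y) for (rel, x, y) in facts if rel == "parent"}
    changed = True
    while changed:                       # derive until nothing new appears
        new = {(x, y) for (x, z) in anc for (z2, y) in anc if z == z2}
        changed = not new <= anc
        anc |= new
    return anc

print(ancestors(facts))   # includes ('alice', 'carol'), which was never stated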
Term Rewriting (for example CASL) is another form of declarative programming. It involves symbolic transformation of algebraic terms. It's completely distinct from logic programming and functional programming.
imperative - expressions describe a sequence of actions to perform (associative)
declarative - expressions are declarations that contribute to the behavior of the program (associative, commutative, idempotent, monotonic)
functional - expressions have their value as their only effect; semantics support equational reasoning (see the sketch below)
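A small Python sketch of what "equational reasoning" buys you (function names are mine):

def pure_square(x):
    return x * x

log = []
def impure_square(x):
    log.append(x)          # side effect: each call leaves a trace
    return x * x

# For the pure function, pure_square(3) + pure_square(3) can be
# rewritten as 2 * pure_square(3) without changing the program.
assert pure_square(3) + pure_square(3) == 2 * pure_square(3)

# The impure version returns the same number, but the two forms are
# no longer interchangeable: calling it twice leaves log == [3, 3],
# while 2 * impure_square(3) would leave log == [3].
impure_square(3) + impure_square(3)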
Since I wrote my prior answer, I have formulated a new definition of the declarative property which is quoted below. I have also defined imperative programming as the dual property.
This definition is superior to the one I provided in my prior answer, because it is succinct and more general. But it may be more difficult to grok, because the implications of the incompleteness theorems, applicable to programming and to life in general, are difficult for humans to wrap their minds around.
The quoted explanation of the definition discusses the role pure functional programming plays in declarative programming.
All exotic types of programming fit into the following taxonomy of declarative versus imperative, since the following definition claims they are duals.
Declarative vs. Imperative
The declarative property is weird, obtuse, and difficult to capture in a technically precise definition that remains general and not ambiguous, because it is a naive notion that we can declare the meaning (a.k.a semantics) of the program without incurring unintended side effects. There is an inherent tension between expression of meaning and avoidance of unintended effects, and this tension actually derives from the incompleteness theorems of programming and our universe.
It is an oversimplification, technically imprecise, and often ambiguous to define declarative as "what to do" and imperative as "how to do". An ambiguous case: the "what" is the "how" in a program that outputs a program, i.e. a compiler.
Evidently the unbounded recursion that makes a language Turing complete is also, analogously, present in the semantics, not only in the syntactical structure of evaluation (a.k.a. operational semantics). This is logically an example analogous to Gödel's theorem: "any complete system of axioms is also inconsistent". Ponder the contradictory weirdness of that quote! It is also an example that demonstrates how the expression of semantics has no provable bound; thus we can't prove[2] that a program (and analogously its semantics) halts, a.k.a. the Halting theorem.
The incompleteness theorems derive from the fundamental nature of our universe, which as stated in the Second Law of Thermodynamics is “the entropy (a.k.a. the # of independent possibilities) is trending to maximum forever”. The coding and design of a program is never finished— it's alive!— because it attempts to address a real world need, and the semantics of the real world are always changing and trending to more possibilities. Humans never stop discovering new things (including errors in programs ;-).
To precisely and technically capture this aforementioned desired notion within this weird universe that has no edge (ponder that! there is no “outside” of our universe), requires a terse but deceptively-not-simple definition which will sound incorrect until it is explained deeply.
Definition:
The declarative property is where there can exist only one possible set of statements that can express each specific modular semantic.
The imperative property[3] is the dual, where semantics are inconsistent under composition and/or can be expressed with variations of sets of statements.
This definition of declarative is distinctively local in semantic scope, meaning that it requires that a modular semantic maintain its consistent meaning regardless where and how it's instantiated and employed in global scope. Thus each declarative modular semantic should be intrinsically orthogonal to all possible others— and not an impossible (due to incompleteness theorems) global algorithm or model for witnessing consistency, which is also the point of “More Is Not Always Better” by Robert Harper, Professor of Computer Science at Carnegie Mellon University, one of the designers of Standard ML.
Examples of these modular declarative semantics include category theory functors (e.g. the Applicative), nominal typing, namespaces, named fields, and, w.r.t. the operational level of semantics, pure functional programming.
Thus well designed declarative languages can more clearly express meaning, albeit with some loss of generality in what can be expressed, yet a gain in what can be expressed with intrinsic consistency.
An example of the aforementioned definition is the set of formulas in the cells of a spreadsheet program, which are not expected to give the same meaning when moved to different column and row cells, i.e. when the cell identifiers are changed. The cell identifiers are part of, and not superfluous to, the intended meaning. So each spreadsheet result is unique w.r.t. the cell identifiers in a set of formulas. The consistent modular semantic in this case is the use of cell identifiers as the input and output of pure functions for cell formulas (see below).
Hyper Text Markup Language a.k.a. HTML, the language for static web pages, is an example of a highly (but not perfectly[3]) declarative language that (at least before HTML 5) had no capability to express dynamic behavior. HTML is perhaps the easiest language to learn. For dynamic behavior, an imperative scripting language such as JavaScript was usually combined with HTML. HTML without JavaScript fits the declarative definition because each nominal type (i.e. the tags) maintains its consistent meaning under composition within the rules of the syntax.
A competing definition for declarative is the commutative and idempotent properties of the semantic statements, i.e. that statements can be reordered and duplicated without changing the meaning. For example, statements assigning values to named fields can be reordered and duplicated without changing the meaning of the program, if those names are modular w.r.t. any implied order. Names sometimes imply an order, e.g. cell identifiers include their column and row position, so moving a total on a spreadsheet changes its meaning. Otherwise, these properties implicitly require global consistency of semantics.
It is generally impossible to design the semantics of statements so that they remain consistent if randomly ordered or duplicated, because order and duplication are intrinsic to semantics. For example, consider the statements "Foo exists" (or construction) and "Foo does not exist" (or destruction). If one considers random inconsistency endemic to the intended semantics, then one accepts this definition as general enough for the declarative property. In essence this definition is vacuous as a generalized definition, because it attempts to make consistency orthogonal to semantics, i.e. it defies the fact that the universe of semantics is dynamically unbounded and can't be captured in a global coherence paradigm.
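A Python sketch of the reordering test (keyword arguments stand in for named fields; the example is mine, not from the text above):

# Named fields: the order of the "statements" does not matter,
# so by the competing definition this fragment is declarative.
a = dict(width=100, height=50)
b = dict(height=50, width=100)
assert a == b

# Positionally ordered imperative statements: reordering changes meaning.
xs = []
xs.append(1)
xs.append(2)      # xs == [1, 2]; swap the two appends and xs == [2, 1]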
Requiring the commutative and idempotent properties for the (structural evaluation order of the) lower-level operational semantics converts operational semantics to a declarative localized modular semantic, e.g. pure functional programming (including recursion instead of imperative loops). Then the operational order of the implementation details does not impact (i.e. spread globally into) the consistency of the higher-level semantics. For example, the order of evaluation of (and theoretically also the duplication of) the spreadsheet formulas doesn't matter because the outputs are not copied to the inputs until after all outputs have been computed, i.e. analogous to pure functions.
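A sketch of that spreadsheet evaluation rule in Python (cell names and formulas invented for illustration): every formula reads only the previous snapshot, so the order in which cells are recomputed is invisible.

# Each cell's formula is a pure function of the previous snapshot.
formulas = {
    "A1": lambda cells: 2,
    "A2": lambda cells: cells["A1"] * 10,
    "A3": lambda cells: cells["A1"] + cells["A2"],
}

def step(cells):
    # All outputs are computed from the same old snapshot, then
    # swapped in at once, so evaluation order cannot matter.
    return {name: f(cells) for name, f in formulas.items()}

cells = {"A1": 0, "A2": 0, "A3": 0}
for _ in range(3):            # iterate to a fixed point
    cells = step(cells)
print(cells)                  # {'A1': 2, 'A2': 20, 'A3': 22}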
C, Java, C++, C#, PHP, and JavaScript aren't particularly declarative.
Copute's syntax and Python's syntax are more declaratively coupled to intended results, i.e. consistent syntactical semantics that eliminate the extraneous so one can readily comprehend code after they've forgotten it. Copute and Haskell enforce determinism of the operational semantics and encourage "don't repeat yourself" (DRY), because they only allow the pure functional paradigm.
[2] Even where we can prove the semantics of a program, e.g. with the language Coq, this is limited to the semantics that are expressed in the typing, and typing can never capture all of the semantics of a program, not even for languages that are not Turing complete; e.g. with HTML+CSS it is possible to express inconsistent combinations which thus have undefined semantics.
[3] Many explanations incorrectly claim that only imperative programming has syntactically ordered statements. I clarified this confusion between imperative and functional programming. For example, the order of HTML statements does not reduce the consistency of their meaning.
Edit: I posted the following comment to Robert Harper's blog:
in functional programming ... the range of variation of a variable is a type
Depending on how one distinguishes functional from imperative programming, your "assignable" in an imperative program may also have a type placing a bound on its variability.
The only non-muddled definition I currently appreciate for functional programming is a) functions as first-class objects and types, b) preference for recursion over loops, and/or c) pure functions, i.e. those functions which do not impact the desired semantics of the program when memoized (thus perfectly pure functional programming doesn't exist in a general-purpose denotational semantics due to impacts of operational semantics, e.g. memory allocation).
The idempotent property of a pure function means the function call on its variables can be substituted by its value, which is not generally the case for the arguments of an imperative procedure. Pure functions seem to be declarative w.r.t. the uncomposed state transitions between the input and result types.
But the composition of pure functions does not maintain any such consistency, because it is possible to model a side-effect (global state) imperative process in a pure functional programming language, e.g. Haskell's IOMonad, and moreover it is entirely impossible to prevent doing so in any Turing-complete pure functional programming language.
As I wrote in 2012, and as seems to be the consensus of comments on your recent blog post, declarative programming is an attempt to capture the notion that the intended semantics are never opaque. Examples of opaque semantics are dependence on order, dependence on erasure of higher-level semantics at the operational semantics layer (e.g. casts are not conversions and reified generics limit higher-level semantics), and dependence on variable values which cannot be checked (proved correct) by the programming language. Thus I have concluded that only non-Turing-complete languages can be declarative.
Thus one unambiguous and distinct attribute of a declarative language could be that its output can be proven to obey some enumerable set of generative rules. For example, for any specific HTML program (ignoring differences in the ways interpreters diverge) that is not scripted (i.e. is not Turing complete), its output variability can be enumerated. Or more succinctly, an HTML program is a pure function of its variability. Ditto, a spreadsheet program is a pure function of its input variables.
So it seems to me that declarative languages are the antithesis of unbounded recursion, i.e. per Gödel's second incompleteness theorem, self-referential theorems can't be proven.
Leslie Lamport wrote a fairytale about how Euclid might have worked around Gödel's incompleteness theorems, applied to math proofs in the programming language context, via the congruence between types and logic (Curry-Howard correspondence, etc.).
Imperative programming: telling the “machine” how to do something, and as a result what you want to happen will happen.
Declarative programming: telling the “machine” what you would like to happen, and letting the computer figure out how to do it.
Example of imperative
function makeWidget(options) {
    const element = document.createElement('div');
    element.style.backgroundColor = options.bgColor;
    element.style.width = options.width;
    element.style.height = options.height;
    element.textContent = options.txt;
    return element;
}
Example of declarative
function makeWidget(type, txt) {
    return new Element(type, txt);
}
Note: The difference is not one of brevity or complexity or abstraction. As stated, the difference is how vs what.
I think that your taxonomy is incorrect. There are two opposite types: imperative and declarative. Functional is just a subtype of declarative. BTW, Wikipedia states the same fact.
Nowadays, a new focus: do we need the old classifications?
The imperative/declarative/functional aspects were good in the past for classifying generic languages, but nowadays all "big languages" (such as Java, Python, JavaScript, etc.) have some option (typically frameworks) to express with "another focus" than their main one (usually imperative), and to express parallel processes, declarative functions, lambdas, etc.
So a good variant of this question is "What aspect is good for classifying frameworks today?"
... An important aspect is something that we can label "programming style"...
Focus on the fusion of data with algorithm
A good example helps to explain. As you can read about jQuery at Wikipedia,
The set of jQuery core features — DOM element selections, traversal and manipulation —, enabled by its selector engine (...), created a new "programming style", fusing algorithms and DOM-data-structures
So jQuery is the best (popular) example of focusing on a "new programming style", which is not only object orientation but "fusing algorithms and data structures". jQuery is somewhat reactive, like spreadsheets, but is not "cell-oriented"; it is "DOM-node oriented"... Comparing the main styles in this context:
No fusion: in all "big languages", in any functional/declarative/imperative expression, the usual is "no fusion" of data and algorithm, except for some object orientation, which is a fusion from a strict algebraic-structure point of view.
Some fusion: all the classic fusion strategies nowadays have some framework using them as a paradigm... dataflow, event-driven programming (or old domain-specific languages such as awk and XSLT)... Like programming with modern spreadsheets, these are also examples of the reactive programming style.
Big fusion: is "the jQuery style"... jQuery is a domain specific language focusing on "fusing algorithms and DOM-data-structures". PS: other "query languages", as XQuery, SQL (with PL as imperative expression option) are also data-algorith-fusion examples, but they are islands, with no fusion with other system modules... Spring, when using find()-variants and Specification clauses, is another good fusion example.
Declarative programming is programming by expressing some timeless logic between the input and the output, for instance, in pseudocode, the following example would be declarative:
def factorial(n):
    if n < 2:
        return 1
    else:
        return n * factorial(n - 1)

output = factorial(argvec[0])
We just define a relationship called 'factorial' here, and define the output as that relationship applied to the input. As should be evident here, just about any structured language allows declarative programming to some extent. A central idea of declarative programming is immutable data: if you assign to a variable, you only do so once, and then never again. Other, stricter definitions entail that there may be no side effects at all; these languages are sometimes called 'purely declarative'.
The same result in an imperative style would be:
a = 1
b = argvec[0]
while b > 1:
    a = a * b
    b = b - 1

output = a
In this example, we expressed no timeless static logical relationship between the input and the output; we changed memory addresses manually until one of them held the desired result. It should be evident that all languages allow declarative semantics to some extent, but not all allow imperative; some 'purely' declarative languages forbid side effects and mutation altogether.
Declarative languages are often said to specify 'what must be done', as opposed to 'how to do it'. I think that is a misnomer: declarative programs still specify how one must get from input to output, but in another way; the relationship you specify must be effectively computable (an important term, look it up if you don't know it). Another approach is nondeterministic programming, which really just specifies what conditions a result must meet, before your implementation simply exhausts all paths by trial and error until one succeeds.
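A sketch of that trial-and-error idea in Python (the condition and helper function are invented for illustration): we state only the condition the result must meet, and the implementation blindly exhausts the candidates.

def solve(condition, candidates):
    # Try every path until one satisfies the stated condition.
    for c in candidates:
        if condition(c):
            return c
    return None

# "Find a two-digit number whose square ends in 44"; we never say how.
answer = solve(lambda n: (n * n) % 100 == 44, range(10, 100))
print(answer)   # 12, since 12 * 12 == 144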
Purely declarative languages include Haskell and Pure Prolog. A sliding scale from one end to the other would be: Pure Prolog, Haskell, OCaml, Scheme/Lisp, Python, JavaScript, C--, Perl, PHP, C++, Pascal, C, Fortran, Assembly.
Some good answers here regarding the noted "types".
I submit some additional, more "exotic" concepts often associated with the functional programming crowd:
Domain Specific Language or DSL Programming: creating a new language to deal with the problem at hand.
Meta-Programming: when your program writes other programs (see the sketch after this list).
Evolutionary Programming: where you build a system that continually improves itself or generates successively better generations of sub-programs.
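A tiny meta-programming sketch in Python (the names are invented for illustration): the program writes the source of another function, then compiles and runs it.

# Generate the source code of a power function, then bring it to life.
def make_power_function(n):
    source = f"def power(x):\n    return x ** {n}\n"
    namespace = {}
    exec(source, namespace)      # the program runs the code it just wrote
    return namespace["power"]

cube = make_power_function(3)
print(cube(2))   # 8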
In a nutshell, the more a programming style emphasizes What (to do), abstracting away the details of How (to do it), the more that style is considered to be declarative. The opposite is true for imperative. Functional programming is associated with the declarative style.