Concrete code examples of the benefits of dynamic programming languages - language-agnostic

I am currently working on the design of a controlled experiment where I
hope to measure a benefit of dynamically typed programming languages
compared to statically typed ones.
I am not looking for another "which one is better" debate here, as there
are enough discussions on this topic (e.g.
Dynamic type languages versus static type languages
or
What do people find so appealing about dynamic languages?).
I also asked the same question on LtU, which ended up in another
discussion. :-)
All these discussions mention some good points, but nearly all are missing
concrete code(!) examples that prove their point.
So my question is: can any of you give me some example code that
directly reveals a benefit of a dynamically typed language, or shows a
situation where a static type system is more of an obstacle than a helpful
tool?
Thanks in advance.

By definition: duck typing. But you want a more specific example, so let's say: building objects. In statically typed languages you need to declare a type or use built-in facilities for runtime code generation; in dynamic languages you can just build objects ad hoc. For example, to create a mock in Groovy you can just write the one method you need to mock and pass it as an object: http://groovy.codehaus.org/Developer+Testing+using+Closures+instead+of+Mocks
In static languages people write big and complex libraries to make this easier, and it's still verbose and often has limitations.

There was a time, not long ago, when these were the kinds of choices we had to make: do I want it to be fast and easy to write the program, but sometimes there are more runtime bugs? Or do I want to be protected from more of those bugs at compile time, but at the cost of having to waste more time writing more boilerplate?
That's over now. Dynamic languages were extremely exciting back when they were the only way to avoid adding a lot more bureaucracy to your code, but dynamic typing is now being obsoleted by type inference. So you're actually right -- dynamic typing doesn't add value anymore, now that more powerful strong type systems are catching on.
Huh? Where? How?
Languages like Haskell, the MLs, Scala, Rust, and some of the .NET family combine strong typing with type inference so you get the best of both: everything has a strong, enforced type, but you don't have to actually write it down unless the compiler can't figure it out, which it usually can.
Don't ignore them for being academic. Things have changed: languages with powerful type systems have been steadily getting much more popular over the last few years. And I'm finding it more and more rare to hear about a brand-new language these days that doesn't sport a combination of strong typing for safety and type inference for convenience, at least to some degree. Today's brand-new esoteric languages are next decade's "enterprise standards", so I think everyone will be using this best-of-both-worlds approach eventually.
Even old man Java is moving slowly in that direction! (examples: diamond operator; inferring the functional interface for lambda expressions)
Learning dynamic languages for the first time now is too late; it's as if throughout the 1990s you kept putting off replacing your tape deck with a CD player and then finally got around to buying one around 2002.
EDIT: I reread your question. I sympathize with your desire for concrete code. Here's a Rosetta stone: a generic procedure to check whether a list is sorted or not, and code that calls it. The code that calls it does the right thing every day and the wrong thing on leap days, as an example of how compile-time errors win big over runtime errors when you have a bug in rarely-executed code. The code is untested, just a sketch. In particular, I'm still learning Haskell (hence the zeal of the newly converted... :P) and I haven't used Python as heavily as I used to for a couple of years, so forgive any mistakes.
Traditional static typing:
// Java 7 - strong typing but no type inference
// Notice that you get safety, because the wrong usage is caught at compile time
// But you have to type a lot :-(
static interface Comparable { boolean compare(Comparable other); }
...
<T extends Comparable> boolean isListSorted(List<T> xs) {
    T prev = null;
    for (T x : xs) {
        if (prev != null && !prev.compare(x)) return false;
        prev = x;
    }
    return true;
}
...
public final class String implements Comparable { ... }
public final class Animal /* definitely does not implement Comparable! :-) */ { ... }
...
// proper usage
List<String> names = Lists.newArrayListOf("Red", "Spinelli", "Ankh", "Morporkh", "Sam");
boolean areTheNamesSorted = isListSorted(names);
if (todayIsALeapDay()) {
    // bad usage -- the compiler catches it, so you don't have to worry about this happening
    // in production when a user enters data that sends your app down a rarely-touched codepath
    List<Animal> pets = Lists.newArrayListOf(Animal.getCat(), Animal.getDog(), Animal.getBird());
    boolean areTheNamesSorted = isListSorted(pets);
}
Dynamic typing:
class Animal:
    ...

# Python 2.6 -- dynamic typing
# notice how few keystrokes are needed
# but the bug now only shows up at runtime
def isListSorted(xs):
    # to keep it more similar to the Java 7 version, zip is avoided
    prev = None
    for x in xs:
        if prev is not None and not prev.compare(x): return False
        prev = x
    return True
...
# usage -- beautiful, isn't it?
names = ["Raph", "Argh", "Marge"]
areNamesSorted = isListSorted(names)
if isLeapDayToday():
    # ...but here comes trouble in paradise!
    animals = [Animal.getCat(), Animal.getDog()]
    # raises a runtime exception. I hope your unit tests catch it, but unlike type
    # systems it's hard to make absolutely sure you'll be alerted if you forget a test
    # besides, then you've just traded the Evil of Having to Write Lots of Type Signatures
    # for the Evil of Having to Write Lots of Unit Tests. What kind of terrible deal is that?
    areAnimalsSorted = isListSorted(animals)
Powerful typing:
-- Haskell - full safety of strong typing, but easy on the fingers and eyes
class Comparable a where
  compare :: a -> a -> Bool

instance Comparable String where
  ...

data Animal = ...

isListSorted [] = True
isListSorted [x] = True
isListSorted (x1:x2:xs) = compare x1 x2 && isListSorted (x2:xs)

names = ["Ralph", "Argh", "Blarge"]
areNamesSorted = isListSorted names

animals = [Cat, Dog, Parrot]
-- compile-time error because [Animal] is not compatible with Comparable a => [a],
-- since Animal is not an instance of Comparable
areAnimalsSorted = isListSorted animals
Verbose powerful typing:
-- Same Haskell program as before, but we explicitly write down the types
-- Just for educational purposes. Like comments, you can omit them if you don't think
-- the reader needs them because it's obvious. Unlike comments, the compiler verifies their truth!
class Comparable a where
  compare :: a -> a -> Bool

instance Comparable String where
  ...

data Animal = Cat | Dog | Parrot

isListSorted :: Comparable a => [a] -> Bool
isListSorted [] = True
isListSorted [x] = True
isListSorted (x1:x2:xs) = compare x1 x2 && isListSorted (x2:xs)

names :: [String]
names = ["Ralph", "Argh", "Blarge"]
areNamesSorted = isListSorted names

-- compile-time error because [Animal] is not compatible with Comparable a => [a],
-- since Animal is not an instance of Comparable
animals :: [Animal]
animals = [Cat, Dog, Parrot]
areAnimalsSorted = isListSorted animals


Value polymorphism and "generating an exception"

Per The Definition of Standard ML (Revised):
The idea is that dynamic evaluation of a non-expansive expression will neither generate an exception nor extend the domain of the memory, while the evaluation of an expansive expression might.
[§4.7, p19; emphasis mine]
I've found a lot of information online about the ref-cell part, but almost none about the exception part. (A few sources point out that it's still possible for a polymorphic binding to raise Bind, and that this inconsistency can have type-theoretic and/or implementation consequences, but I'm not sure whether that's related.)
I've been able to come up with one exception-related unsoundness that, if I'm not mistaken, is prevented only by the value restriction; but that unsoundness does not depend on raising an exception:
local
  val (wrapAnyValueInExn, unwrapExnToAnyType) =
    let exception EXN of 'a
    in (EXN, fn EXN value => value)
    end
in
  val castAnyValueToAnyType = fn value => unwrapExnToAnyType (wrapAnyValueInExn value)
end
So, can anyone tell me what the Definition is getting at, and why it mentions exceptions?
(Is it possible that "generate an exception" means generating an exception name, rather than generating an exception packet?)
I'm not a type theorist or formal semanticist, but I think I understand what the definition is trying to get at from an operational point of view.
ML exceptions being generative means that, whenever the flow of control reaches the same exception declaration twice, two different exceptions are created. Not only are these distinct objects in memory, but they are also extensionally unequal: we can distinguish these objects by pattern-matching against exception constructors.
[Incidentally, this shows an important difference between ML exceptions and exceptions in most other languages. In ML, new exception classes can be created at runtime.]
On the other hand, if your program builds the same list of integers twice, you may have two different objects in memory, but your program has no way to distinguish between them. They are extensionally equal.
As an example of why generative exceptions are useful, consider MLton's sample implementation of a universal type:
signature UNIV =
sig
  type univ
  val embed : unit -> { inject : 'a -> univ
                      , project : univ -> 'a option
                      }
end

structure Univ :> UNIV =
struct
  type univ = exn

  fun 'a embed () =
    let
      exception E of 'a
    in
      { inject = E
      , project = fn (E x) => SOME x | _ => NONE
      }
    end
end
This code would cause a huge type safety hole if ML had no value restriction:
val { inject = inj1, project = proj1 } = Univ.embed ()
val { inject = inj2, project = proj2 } = Univ.embed ()
(* `inj1` and `proj1` share the same internal exception. This is
* why `proj1` can project values injected with `inj1`.
*
* `inj2` and `proj2` similarly share the same internal exception.
* But this exception is different from the one used by `inj1` and
* `proj1`.
*
* Furthermore, the value restriction makes all of these functions
* monomorphic. However, at this point, we don't know yet what these
* monomorphic types might be.
*)
val univ1 = inj1 "hello"
val univ2 = inj2 5
(* Now we do know:
*
* inj1 : string -> Univ.univ
* proj1 : Univ.univ -> string option
* inj2 : int -> Univ.univ
* proj2 : Univ.univ -> int option
*)
val NONE = proj1 univ2
val NONE = proj2 univ1
(* Which confirms that exceptions are generative. *)
val SOME str = proj1 univ1
val SOME int = proj2 univ2
(* Without the value restriction, `str` and `int` would both
* have type `'a`, which is obviously unsound. Thanks to the
* value restriction, they have types `string` and `int`,
* respectively.
*)
[Hat-tip to Eduardo León's answer for stating that the Definition is indeed referring to this, and for bringing in the phrase "generative exceptions". I've upvoted his answer, but am posting this separately, because I felt that his answer came at the question from the wrong direction, somewhat: most of that answer is an exposition of things that are already presupposed by the question.]
Is it possible that "generate an exception" means generating an exception name, rather than generating an exception packet?
Yes, I think so. Although the Definition doesn't usually use the word "exception" alone, other sources do commonly refer to exception names as simply "exceptions" — including in the specific context of generating them. For example, from http://mlton.org/GenerativeException:
In Standard ML, exception declarations are said to be generative, because each time an exception declaration is evaluated, it yields a new exception.
(And as you can see there, that page consistently refers to exception names as "exceptions".)
The Standard ML Basis Library, likewise, uses "exception" in this way. For example, from page 29:
At one extreme, a programmer could employ the standard exception General.Fail everywhere, letting it carry a string describing the particular failure. […] For example, one technique is to have a function sampleFn in a structure Sample raise the exception Fail "Sample.sampleFn".
As you can see, this paragraph uses the term "exception" twice, once in reference to an exception name, and once in reference to an exception value, relying on context to make the meaning clear.
So it's quite reasonable for the Definition to use the phrase "generate an exception" to refer to generating an exception name (though even so, it is probably a small mistake; the Definition is usually more precise and formal than this, and usually indicates when it intends to rely on context for disambiguation).

What do I do in these conditions to follow FP?

I'm reading about FP and I have two basic questions:
FP says a function should take one input and give a single output. So what should I do with void methods? They don't return anything, right?
FP says a function should have a single responsibility, so how do we handle log statements inside a method? Doesn't that violate the rule?
I wish to know how these things are handled in Scala and Haskell.
Thanks in advance.
I'm assuming you're reading a book called "Functional Programming", although it would help to know who the author is as well. In any case, these questions are relatively easy to answer and I'll give my answers with respect to Haskell because I don't know Scala.
So what should I do with void methods? They don't return anything, right?
There are no void methods in a pure functional language like Haskell. A pure function has no side effects, so a pure function without a return value is meaningless, something like
f :: Int -> ()
f x = let y = x * x + 3 in ()
won't do any computation, y is never calculated and all inputs you give will return the same value. However, if you have an impure function, such as one that writes a file or prints something to the screen then it must exist in a monadic context. If you don't understand monads yet, don't worry. They take a bit to get used to, but they're a very powerful and useful abstraction that can make a lot of problems easier. A monad is something like IO, and in Haskell this takes a type parameter to indicate the value that can be stored inside this context. So you can have something like
putStrLn :: String -> IO ()
Or
-- FYI: FilePath is an alias for String
writeFile :: FilePath -> String -> IO ()
These have side effects, denoted by the IO in the return type, and the () means that there is no meaningful result from the operation. In Python 3, for example, the print function returns None because there isn't anything meaningful to return after printing a value to the screen. A monadic context can also carry a meaningful value, as in readFile or getLine:
getLine :: IO String
readFile :: FilePath -> IO String
When writing your main function, you could do something like
main = do
  putStrLn "Enter a filename:"
  fname <- getLine -- fname has type String
  writeFile fname "This text will be in a file"
  contents <- readFile fname
  putStrLn "I wrote the following text to the file:"
  putStrLn contents
FP says a function should have a single responsibility, so how do we handle log statements inside a method? Doesn't that violate the rule?
Most functions don't need logging inside them. I know that sounds weird, but it's true. In Haskell and most other functional languages, you'll write a lot of small, easily testable functions that each do one step. It's very common to have lots of 1 or 2 line functions in your application.
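To make that concrete, here is a tiny made-up sketch of the style being described (isAdult, initials and describe are invented names, not from any real application; each piece does one small thing and is trivial to test on its own):
isAdult :: Int -> Bool
isAdult age = age >= 18

initials :: String -> String -> String
initials first family = take 1 first ++ take 1 family

describe :: String -> String -> Int -> String
describe first family age
  | isAdult age = initials first family ++ " (adult)"
  | otherwise   = initials first family ++ " (minor)"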
When you actually do need to do logging, say you're building a web server, there are a couple of different approaches you can take. There is actually a monad called Writer that lets you aggregate values as you perform operations. These operations don't have to be impure and do IO; they can be entirely pure. However, a true logging framework that one might use for a web server or large application would likely come with its own monad. This is so that you can set up logging to the screen, to files, network locations, email, and more. This monad will wrap the IO monad so that it can perform these side effects. A more advanced one would probably use libraries like monad transformers or extensible effects. These let you "combine" different monads so you can use utilities for both at the same time. You might see code like
type MyApp a = LogT IO a
-- log :: Monad m => LogLevel -> String -> LogT m ()
getConnection :: Socket -> MyApp Connection
getConnection sock = do
  log DEBUG "Waiting for next connection"
  conn <- liftIO $ acceptConnection sock
  log INFO $ "Accepted connection from IP: " ++ show (connectionIP conn)
  return conn
I'm not expecting you to understand this code fully, but I hope you can see that it has logging and network operations mixed together. The liftIO function is a common one with monad transformers that "transforms" an IO operation into a new monad that wraps IO.
This may sound pretty confusing, and it can be at first if you're used to Python, Java, or C++-like languages. I certainly was confused! But after I got used to thinking about problems in this different way, I found myself wishing I had these constructs in OOP languages all the time.
I can answer from a Haskell perspective.
FP says a function should take one input and give a single output. So what should I do with void methods? They don't return anything, right?
Because that's what functions actually are! In mathematics, every function takes some input and gives you some output. You cannot expect output without giving any input. The void methods you see in other languages don't make sense in a mathematical way. But in reality, void methods in other languages perform some kind of IO operation, which is abstracted as the IO monad in Haskell.
how do we handle log statements inside the method
You can use a monad transformer stack and lift your IO logging operations into it. In fact, the Writer monad can do logging purely, without any IO at all.
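For example, here is a minimal sketch of pure logging with Writer (this assumes the mtl package; step and pipeline are made-up names for illustration):
import Control.Monad.Writer

-- step doubles a number and records what it did; no IO is involved
step :: Int -> Writer [String] Int
step x = do
  tell ["doubling " ++ show x]
  return (x * 2)

pipeline :: Int -> Writer [String] Int
pipeline x = step x >>= step

main :: IO ()
main = do
  let (result, logLines) = runWriter (pipeline 3)
  print result             -- 12
  mapM_ putStrLn logLines  -- the accumulated log lines
The log is just an ordinary value that falls out of runWriter, so the logging itself stays pure; only printing it at the end touches IO.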

Are there disadvantages to return type inference? If yes, what are they?

A lot of statically typed languages, like C++ and C#, have local variable type inference (with the keywords auto and var respectively, I think).
However, I haven't seen many C-derived languages (apart from those mentioned in the comments) implementing compile-time return type inference. I'll describe what I mean by "return type inference" before I ask the question. (I definitely don't mean overloading by return type.)
Consider this code in a hypothetical C#-like language:
private auto SomeMethod(int x)
{
    return 3 * x;
}
It's more than obvious (to humans and to the compiler) that the return type is int (and the compilers can verify it).
The same goes for multiple paths:
private auto SomeOtherMethod(int x)
{
    if(x == 0) return 1;
    else return 3 * x;
}
It's still not ambiguous at all, because there is already an algorithm in said languages to resolve whether two expressions have compatible types:
private auto YetAnotherMethod(int x)
{
    var r = (x == 0) ? 1 : 3 * x;
    return r;
}
Since the algorithm exists and it is already implemented in some form, it's probably not a technical problem in this regard. But still, I haven't seen it anywhere in statically typed languages, which got me thinking about whether there's something bad about it.
My question:
Does return type inference, as a concept, have any disadvantage or subtle pitfall that I'm not seeing? (Apart from readability - I already understand that.)
Is there some corner case where it would introduce problems or ambiguity to a statically typed language? (By "introduce", I'm referring to issues that local variable type inference doesn't already have.)
Yes, there are disadvantages. One you already mentioned: readability. Second, the type has to be calculated, so it takes time (in Turing-complete type systems it may even be infinite). But there is also something else: the theory of type systems becomes much more complicated.
Let's write a function that takes a list and returns its head. What's its type? Or a function that takes a function and a parameter, applies the function, and returns the result. In many languages you can't even declare such a type. To support this kind of thing, Java introduced generics, and it failed miserably; currently it's one of the most hated features of the language because of consistency problems.
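For comparison, in Haskell the compiler infers the fully general types of both functions just described without any annotations (a small sketch; headOf and applyTo are made-up names so they don't clash with the Prelude):
headOf (x:_) = x       -- inferred: headOf :: [a] -> a
applyTo f x  = f x     -- inferred: applyTo :: (a -> b) -> a -> b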
Another thing: the return type may depend not only on the body of the function but also on the context of the invocation. Let's look at Haskell (which has the best type system I've ever seen): http://learnyouahaskell.com/types-and-typeclasses
There is a function called read that takes a string, parses it, and returns... whatever you need: an int, an array.
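For instance (a small sketch using the standard Prelude's read, whose type is Read a => String -> a), the same function gets a different return type depending on what the calling context demands:
n :: Int
n = read "42"            -- here read returns an Int

xs :: [Double]
xs = read "[1.0, 2.5]"   -- the same function now returns [Double]

-- with no context at all, `read "42"` on its own is ambiguous and rejected at compile time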
So each time a type system is designed, the designer has to choose at which level she wants to stop. Dynamic languages decided not to infer types at all, Scala decided to do some local inference but not, for example, for overloaded or recursive functions, and C++ decided not to infer the result type.

Is there any advantage in specifying types of variables and return type of functions?

I always set the types of my variables and functions, a habit I brought over from learning Java; it seems like the right thing to do.
But I often see "weak typing" in other people's code, and I can't argue against it because I don't know what the real advantages of keeping everything strongly typed are.
I think my question is clear, but I'll give some examples:
var id = "Z226";
function changeId(newId) {
    id = newId;
    return newId;
}
My code would be like this:
var id:String = "Z226";
function changeId(newId:String):String {
    id = newId;
    return newId;
}
Yes, the big advantages are:
faster code execution, because the runtime knows the type and does not have to work it out when evaluating the call
better tool support: auto-completion and code hints will work with typed arguments and return types
far better readability
You get performance benefits from strong typing. See http://gskinner.com/talks/quick/#45
I also find strongly typed code to be much more readable, but I guess depending on the person they may not care.
As pointed out by florian, two advantages of strong typing are that development tools can use the information to provide better code-hinting and code-completion, and that the type, as an explicit indicator of how the variable or method is intended to be used, can make the code much easier to understand.
The question of performance seems to be up for debate. However, this answer on Stack Overflow suggests that typed code is definitely faster than untyped code in certain benchmark tests but, as the author states, not so much that you would notice it under normal conditions.
However, I would argue that the biggest advantage of strong typing is that you get a compiler error if you attempt to assign or return a value of the wrong type. This helps prevent the kind of pernicious bug which can only be tracked down by actually running the program.
Consider the following contrived example in which ActionScript automatically converts the result to a string before returning. Strongly typing the method's parameter and return type ensures that the program will not compile and a warning is issued. This could potentially save you hours of debugging.
function increment(value) {
    return value + 1;
}

trace(increment("1"));
// 11
While the points in the other answers about code hinting and error checking are accurate, I want to address the claim about performance. It's really not all that true. In theory, strong typing allows the compiler to generate code that's closer to native. With the current VM, though, such optimization doesn't happen. Here and there the AS3 compiler will employ an integer instruction instead of a floating-point one; otherwise the type indicators don't have much effect at runtime.
For example, consider the following code:
function hello():String {
return "Hello";
}
var s:String = hello() + ' world';
trace(s);
Here are the AVM2 opcodes resulting from it:
getlocal_0
pushscope
getlocal_0
getlocal_0
callproperty 4 0 ; call hello()
pushstring 12 ; push ' world' onto stack
add ; concatenate the two
initproperty 5 ; save it to var s
findpropstrict 7 ; look up trace
getlocal_0 ; push this onto stack
getproperty 5 ; look up var s
callpropvoid 7 1 ; call trace
returnvoid
Now, if I remove the type indicators, I get the following:
getlocal_0
pushscope
getlocal_0
getlocal_0
callproperty 3 0
pushstring 11
add
initproperty 4
findpropstrict 6
getlocal_0
getproperty 4
callpropvoid 6 1
returnvoid
It's exactly the same, except all the name indices are one less since 'String' no longer appears in the constant table.
I'm not trying to discourage people from employing strong typing. One just shouldn't expect miracles on the performance front.
EDIT: In case anyone is interested, I've put my AS3 bytecode disassembler online:
http://flaczki.net46.net/codedump/
I've improved it so that it now dereferences the operands.

Emulating namespaces in Fortran 90

One of the most troublesome issues with Fortran 90 is the lack of namespacing. In the previous question "How do you use Fortran 90 module data" from Pete, the main issue was discussed: USE behaves like a "from module import *" in Python: everything that is declared public in the module is imported as-is into the scope of the importing module. No prefixing. This makes it very, very hard to understand, while reading some code, where a given identifier comes from, and whether a given module is still used or not.
A possible solution, discussed in the question I linked above, is to use the ONLY keyword both to limit the imported identifiers and to document where they come from, although this is very, very tedious when the module is very large. Keeping modules small and always using USE, ONLY: is a potentially good strategy to work around the lack of namespacing and qualifying prefixes in Fortran 9X.
Are there other (not necessarily better) workaround strategies? Does the Fortran 2k3 standard say anything regarding namespacing support?
For me this is the most irritating Fortran feature related to modules. The only solution is to add a common prefix to procedures, variables, constants, etc. to avoid namespace collisions.
One can prefix all entities (prefixing all public entities seems more appropriate) right inside the module:
module constants
    implicit none
    real, parameter :: constants_pi = 3.14
    real, parameter :: constants_e = 2.71828183
end module constants
The drawback is increased verbosity inside the module. As an alternative, one can use a namespace-prefix wrapper module, as suggested here, for example.
module constants_internal
    implicit none
    real, parameter :: pi = 3.14
    real, parameter :: e = 2.71828183
end module constants_internal

module constants
    use constants_internal, only: &
        constants_pi => pi, &
        constants_e => e
end module constants
The last is a small modification of what you, Stefano, suggested.
Even if we accept the verbosity, the fact that Fortran is not a case-sensitive language forces us to use the same separator (_) in entity names. And it will be really difficult to distinguish the module name (as a prefix) from the entity name unless we use a strong naming discipline, for example, module names that are one word only.
Having several years of Fortran-only programming experience (I got into Python only a year ago), I was not aware of the concept of namespaces for a while. So I guess I learned to just keep track of everything imported and, as High Performance Mark said, to use ONLY as much as you have time to (tedious).
Another way I can think of to emulate a namespace would be to declare everything within a module as a component of a derived type. Fortran won't let you name the module the same way as the namespace, but prefixing module_ to the module name could be intuitive enough:
MODULE module_constants
    IMPLICIT NONE
    TYPE constants_namespace
        REAL :: pi=3.14159
        REAL :: e=2.71828
    ENDTYPE
    TYPE(constants_namespace) :: constants
ENDMODULE module_constants

PROGRAM namespaces
    USE module_constants
    IMPLICIT NONE
    WRITE(*,*)constants%pi
    WRITE(*,*)constants%e
ENDPROGRAM namespaces
Fortran 2003 has the new ASSOCIATE construct, and don't forget the possibility of renaming USE-associated entities. But I don't think that either of these gets much closer to providing a good emulation of namespaces than what Fortran 90 already has; they are just (slightly) better workarounds.
Like some of the respondents to the question you link to, I tend to think that modules with very many identifiers should probably be split into smaller modules (or, wait for Fortran 2008 and use submodules) and these days I almost always specify an ONLY clause (with renames) for USE statements.
I can't say that I miss namespaces much, but then I've never had them really.
No one else submitted this suggestion as an answer (though someone did in the comments of one answer), so I'm going to submit it in hopes it may help someone else.
You can emulate namespaces in the following way, and I would hope there would be no noticeable performance hit from the compiler for doing so (if there were, all object-oriented Fortran programming would suffer). I use this pattern in my production code. The only real downfall for me is that derived types cannot contain parameter (constant) components, but I offer an option for that as well below.
Module Math_M
    IMPLICIT NONE
    PRIVATE
    public :: Math

    Type Math_T
        real :: pi = 3.14159
    contains
        procedure, nopass :: e => math_e
        procedure :: calcAreaOfCircle => math_calcAreaOfCircle
    End Type

    Type(Math_T) :: Math
    real, parameter :: m_e = 2.71828

contains

    function math_e() result(e)
        real :: e
        e = m_e
    end function math_e

    function math_calcAreaOfCircle(this, r) result(a)
        class(Math_T), intent(in) :: this
        real, intent(in) :: r
        real :: a
        a = this%pi * r**2.0
    end function math_calcAreaOfCircle

End Module Math_M
And the usage
Program Main
    use Math_M
    IMPLICIT NONE
    print *, Math%pi
    print *, Math%e()
    print *, Math%calcAreaOfCircle(2.0)
End Program Main
Personally I prefer using $ over the _ for module variables, but not all compilers like that without compiler flags. Hopefully this helps someone in the future.