Do you have to declare a function's type?

One thing I do not fully understand about Haskell is declaring functions and their types: is it something you have to do or is it just something you should do for clarity? Or are there certain scenarios where you need to do it, just not all?

In most cases you don’t need to declare the type of a function that uses only standard Haskell type system features. Haskell 98 is specified with global type inference, so the types of top-level bindings can almost always be inferred; the main exception in the standard language is polymorphic recursion, which requires a signature.
However, it’s good practice to include type annotations for top-level definitions, for a few reasons:
Verifying that the inferred type matches your expectations
Helping the compiler produce better diagnostic messages when there are type mismatches
Most importantly, documenting your intent and making the code more readable for humans!
As for definitions in where clauses, it’s a matter of style. The conventional style is to omit them, partly because in some cases their types could not be written explicitly before the ScopedTypeVariables extension. I consider the omission of scoped type variables a bit of a bug in the 1998 and 2010 standards. GHC is the de facto standard compiler today, so the extension is widely available, but it is still nonstandard. Regardless, it’s good practice to include annotations where possible for nontrivial code, and doing so is helpful for you as a programmer.
In practice, it’s common to use some language extensions that complicate type inference or make it “undecidable”, meaning that, at least for arbitrary programs, it’s impossible to always infer a type, or at least a unique “best” type. But for the sake of usability, extensions are usually designed very carefully to only require annotations at the point where you actually use them.
For example, GHC (and standard Haskell) will only infer polymorphic types with top-level foralls, which are normally left entirely implicit. (They can be written explicitly using ExplicitForAll.) If you need to pass a polymorphic function as an argument to another function like (forall t. …) -> … using RankNTypes, this requires an annotation to override the compiler’s assumption that you meant something like forall t. (… -> …), or that you mistakenly applied the function on different types.
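As a minimal sketch of the RankNTypes case (the name applyToBoth is made up for illustration): the argument f is used at two different element types, so it has to be polymorphic itself, and GHC will not infer that without the signature.

```haskell
{-# LANGUAGE RankNTypes #-}

-- `f` is applied to a [Bool] and to a String, so it must be polymorphic
-- in the element type. This rank-2 type is never inferred; removing the
-- signature makes the definition fail to type-check.
applyToBoth :: (forall t. [t] -> Int) -> ([Bool], String) -> (Int, Int)
applyToBoth f (xs, s) = (f xs, f s)

main :: IO ()
main = print (applyToBoth length ([True, False, True], "ab"))  -- (3,2)
```

Note that the annotation is only needed on applyToBoth, the place where the extension is actually used; callers pass length as usual.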
If an extension requires annotations, the rules for when and where you must include them are typically documented in places like the GHC User’s Guide, and formally specified in the papers specifying the feature.

Short answer: Functions are defined in "bindings" and have their types declared in "type signatures". Type signatures for bindings are always syntactically optional, as the language doesn't require their use in any particular case. (There are some places type signatures are required for things other than bindings, like in class definitions or in declarations of data types, but I don't think there's any case where a binding requires an accompanying type signature according to the syntax of the language, though I might be forgetting some weird situation.) The reason they aren't required is that the compiler can usually, though not always, figure out the types of functions itself as part of its type-checking operation.
However, some programs may not compile unless a type signature is added to a binding, and some programs may not compile unless a type signature is removed, so sometimes you need to use them, and sometimes you can't use them (at least not without a language extension and some changes to the syntax of other, nearby type signatures to use the extension).
It is considered best practice to include type signatures for every top-level binding, and the GHC -Wall flag will warn you if any top-level bindings lack an associated type signature. The rationale for this is that top-level signatures (1) provide documentation for the "interface" of your code, (2) aren't so numerous that they overburden the programmer, and (3) usually provide sufficient guidance to the compiler that you get better error messages than if you omit type signatures entirely.
If you look at almost any real-world Haskell source code (e.g., browse the source of any decent library on Hackage), you'll see this convention being used -- all top-level bindings have type signatures, and type signatures are used sparingly in other contexts (in expressions or where or let clauses). I'd encourage any beginner to use this same convention in the code they write as they're learning Haskell. It's a good habit and will avoid many frustrating error messages.
Long answer:
In Haskell, a binding assigns a name to a chunk of code, like the following binding for the function hypo:
hypo a b = sqrt (a*a + b*b)
When compiling a binding (or collection of related bindings), the compiler performs a type-checking operation on the expressions and subexpressions that are involved.
It is this type-checking operation that allows the compiler to determine that the variable a in the above expression must be of some type t that has a Num t constraint (in order to support the * operation), that the result of a*a will be the same type t, and that this implies that b*b and so b are also of this same type t (since only two values of the same type can be added together with +), and that a*a + b*b is therefore of type t, and so the result of the sqrt must also be of this same type t which must incidentally have a Floating t constraint to support the sqrt operation. The information collected and type relationships deduced during this type checking allow the compiler to infer a general type signature for the hypo function automatically, namely:
hypo :: (Floating t) => t -> t -> t
(The Num t constraint doesn't appear because it's implied by Floating t).
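To see the inferred signature in action, here is a small sketch that writes it out explicitly and uses the same binding at two different Floating types. The main function is only there to make the snippet runnable.

```haskell
-- Same binding as above, with the inferred signature written out.
hypo :: (Floating t) => t -> t -> t
hypo a b = sqrt (a*a + b*b)

main :: IO ()
main = do
  print (hypo 3 4 :: Double)   -- t instantiated at Double, prints 5.0
  print (hypo 5 12 :: Float)   -- same binding at Float, prints 13.0
```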
Because the compiler can learn the type signatures of (most) bound names, like hypo, automatically as a side-effect of the type-checking operation, there's no fundamental need for the programmer to explicitly supply this information, and that's the motivation for the language making type signatures optional. The only requirements the language places on type signatures are that, if they are supplied, they must appear in the same declaration list as the associated binding (e.g., both must appear in the same module, or in the same where clause or whatever, and you can't have a type signature without a binding), there must be at most one type signature for a binding (no duplicate type signatures, even if they are identical, unlike in C, say), and the type supplied in the type signature must not be in conflict with the results of type checking.
The language allows the type signature and binding to appear anywhere in the same declaration list, in any order, and with no requirement they be next to each other, so the following is valid Haskell code:
double :: (Num a) => a -> a
half x = x / 2
double x = x + x
half :: (Fractional a) => a -> a
Such silliness is not recommended, however, and the convention is to place the type signature immediately before the corresponding binding, though one exception is to have a type signature shared across multiple bindings of the same type, whose definitions follow:
ex1, ex2, ex3 :: Tree Int
ex1 = Leaf 1
ex2 = Node (Leaf 2) (Leaf 3)
ex3 = Node (Node (Leaf 4) (Leaf 5)) (Leaf 5)
In some situations, the compiler cannot automatically infer the correct type for a binding, and a type signature may be required. The following binding requires a type signature and won't compile without it. (The technical problem is that toList is written using polymorphic recursion.)
data Binary a = Leaf a | Pair (Binary (a,a)) deriving (Show)
-- following type signature is required...
toList :: Binary a -> [a]
toList (Leaf x) = [x]
toList (Pair b) = concatMap (\(x,y) -> [x,y]) (toList b)
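A short usage sketch may make the polymorphic recursion more concrete. The value example below is made up for illustration. Each Pair layer doubles the nesting of the element type, so the recursive call to toList happens at type Binary (a,a) rather than Binary a, which is exactly what inference cannot work out by itself.

```haskell
data Binary a = Leaf a | Pair (Binary (a,a)) deriving (Show)

-- The signature is required: the recursive call is at a different type.
toList :: Binary a -> [a]
toList (Leaf x) = [x]
toList (Pair b) = concatMap (\(x,y) -> [x,y]) (toList b)

-- Two Pair layers, so the Leaf holds a pair of pairs.
example :: Binary Int
example = Pair (Pair (Leaf ((1,2),(3,4))))

main :: IO ()
main = print (toList example)  -- [1,2,3,4]
```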
In other situations, the compiler can automatically infer the correct type for a binding, but the type can't be expressed in a type signature (at least, not without some GHC extensions to the standard language). This happens most often in where clauses. (The technical problem is that type variables aren't scoped, and go's type involves the type variable a from the type signature of myLookup.)
myLookup :: Eq a => a -> [(a,b)] -> Maybe b
myLookup k = go
  where -- go :: [(a,b)] -> Maybe b
        go ((k',v):rest) | k == k' = Just v
                         | otherwise = go rest
        go [] = Nothing
There's no type signature in standard Haskell for go that would work here. However, if you enable the ScopedTypeVariables extension, you can write one, provided you also modify the type signature for myLookup itself (adding an explicit forall) to bring the type variables into scope.
{-# LANGUAGE ScopedTypeVariables #-}
myLookup :: forall a b. Eq a => a -> [(a,b)] -> Maybe b
myLookup k = go
  where go :: [(a,b)] -> Maybe b
        go ((k',v):rest) | k == k' = Just v
                         | otherwise = go rest
        go [] = Nothing
It's considered best practice to put type signatures on all top-level bindings and use them sparingly elsewhere. The -Wall compiler flag turns on the -Wmissing-signatures warning which warns about any missing top-level signatures.
The main motivation, I think, is that top-level bindings are the ones that are most likely to be used in multiple places throughout the code at some distance from where they are defined, and the type signature usually provides concise documentation for what a function does and how it's intended to be used. Consider the following type signatures from a Sudoku solver I wrote many years ago. Is there much doubt what these functions do?
possibleSymbols :: Index -> Board -> [Symbol]
possibleBoards :: Index -> Board -> [Board]
setSymbol :: Index -> Board -> Symbol -> Board
While the type signatures auto-generated by the compiler also serve as decent documentation and can be inspected in GHCi, it's convenient to have the type signatures in the source code, as a form of compiler-checked comment documenting the binding's purpose.
Any Haskell programmer who's spent a moment trying to use an unfamiliar library, read someone else's code, or read their own past code knows how helpful top-level signatures are as documentation. (Admittedly, a frequently levelled criticism of Haskell is that sometimes the type signatures are the only documentation for a library.)
A secondary motivation is that in developing and refactoring code, type signatures make it easier to "control" the types and localize errors. Without any signatures, the compiler can infer some really crazy types for code, and the error messages that get generated can be baffling, often identifying parts of the code that have nothing to do with the underlying error.
For example, consider this program:
data Tree a = Leaf a | Node (Tree a) (Tree a)
leaves (Leaf x) = x
leaves (Node l r) = leaves l ++ leaves r
hasLeaf x t = elem x (leaves t)
main = do
  -- some tests
  print $ hasLeaf 1 (Leaf 1)
  print $ hasLeaf 1 (Node (Leaf 2) (Leaf 3))
The functions leaves and hasLeaf compile fine, but main barfs out the following cascade of errors (abbreviated for this posting):
Leaves.hs:12:11-28: error:
• Ambiguous type variable ‘a0’ arising from a use of ‘hasLeaf’
prevents the constraint ‘(Eq a0)’ from being solved.
Probable fix: use a type annotation to specify what ‘a0’ should be.
Leaves.hs:12:19: error:
• Ambiguous type variable ‘a0’ arising from the literal ‘1’
prevents the constraint ‘(Num a0)’ from being solved.
Probable fix: use a type annotation to specify what ‘a0’ should be.
Leaves.hs:12:27: error:
• No instance for (Num [a0]) arising from the literal ‘1’
Leaves.hs:13:11-44: error:
• Ambiguous type variable ‘a1’ arising from a use of ‘hasLeaf’
prevents the constraint ‘(Eq a1)’ from being solved.
Probable fix: use a type annotation to specify what ‘a1’ should be.
Leaves.hs:13:19: error:
• Ambiguous type variable ‘a1’ arising from the literal ‘1’
prevents the constraint ‘(Num a1)’ from being solved.
Probable fix: use a type annotation to specify what ‘a1’ should be.
Leaves.hs:13:33: error:
• No instance for (Num [a1]) arising from the literal ‘2’
With programmer-supplied top-level type signatures:
leaves :: Tree a -> [a]
leaves (Leaf x) = x
leaves (Node l r) = leaves l ++ leaves r
hasLeaf :: (Eq a) => a -> Tree a -> Bool
hasLeaf x t = elem x (leaves t)
a single error is immediately localized to the offending line:
leaves (Leaf x) = x
^
Leaves.hs:4:19: error:
• Occurs check: cannot construct the infinite type: a ~ [a]
Beginners might not understand the "occurs check" but are at least looking at the right place to make the simple fix:
leaves (Leaf x) = [x]
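For reference, here is the whole program with the signatures and the one-character fix applied, as a sketch you can compile directly:

```haskell
data Tree a = Leaf a | Node (Tree a) (Tree a)

leaves :: Tree a -> [a]
leaves (Leaf x)   = [x]                      -- fixed: wrap the leaf in a list
leaves (Node l r) = leaves l ++ leaves r

hasLeaf :: (Eq a) => a -> Tree a -> Bool
hasLeaf x t = elem x (leaves t)

main :: IO ()
main = do
  print $ hasLeaf 1 (Leaf 1)                  -- True
  print $ hasLeaf 1 (Node (Leaf 2) (Leaf 3))  -- False
```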
So, why not add type signatures everywhere, not just at top-level? Well, if you literally tried to add type signatures everywhere they were syntactically valid, you'd be writing code like:
{-# LANGUAGE ScopedTypeVariables #-}
hypo :: forall t. (Floating t) => t -> t -> t
hypo (a :: t) (b :: t) = sqrt (((a :: t) * (a :: t) :: t) + ((b :: t) * (b :: t) :: t) :: t) :: t
so you want to draw the line somewhere. The main argument against adding them for all bindings in let and where clauses is that those bindings are often short bindings easily understood at a glance, and they're localized to the code that you're trying to understand "all at once" anyway. The signatures are also potentially less useful as documentation because bindings in these clauses are more likely to refer to and use other nearby bindings of arguments or intermediate results, so they aren't "self-contained" like a top-level binding. The signature only documents a small portion of what the binding is doing. For example, in:
qsort :: (Ord a) => [a] -> [a]
qsort (x:xs) = qsort l ++ [x] ++ qsort r
  where -- l, r :: [a]
        l = filter (<=x) xs
        r = filter (>x) xs
qsort [] = []
having type signatures l, r :: [a] in the where clause wouldn't add very much. There's also the additional complication that you'd need the ScopedTypeVariables extension to write it, as above, so that's maybe another reason to omit it.
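If you did want the local signature, a sketch of what it would take looks like this. It assumes GHC with ScopedTypeVariables enabled, and the explicit forall on qsort is what brings a into scope for the where clause.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

qsort :: forall a. (Ord a) => [a] -> [a]
qsort (x:xs) = qsort l ++ [x] ++ qsort r
  where
    l, r :: [a]            -- this `a` is the one bound by the forall above
    l = filter (<= x) xs
    r = filter (> x) xs
qsort [] = []

main :: IO ()
main = print (qsort [3, 1, 4, 1, 5, 9, 2, 6 :: Int])  -- [1,1,2,3,4,5,6,9]
```

Without the forall (or without the extension), the a in the local signature would be a fresh type variable and the code would be rejected.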
As I say, I think any Haskell beginner should be encouraged to adopt a similar convention of writing top-level type signatures, ideally writing the top-level signature before starting to write the accompanying bindings. It's one of the easiest ways to leverage the type system to guide the design process and write good Haskell code.


Introductory F# (Fibonacci and function expressions)

I've started a course on introduction to F#, and I've been having some trouble with two assignments. The first one had me creating two functions, where the first function takes an input and adds it with four, and the second one calculates sqrt(x^2+y^2). Then I should write function expressions for them both, but for some reason it gives me the error "Unexpected symbol'|' in implementation file".
let g = fun n -> n + 4;;
let h = fun (x,y) -> System.Math.Sqrt((x*x)+(y*y));;
let f = fun (x,n) -> float
|(n,0) -> g(n)
|(x,n) -> h(x,n);;
The second assignment asks me to create a function which finds the sequence of Fibonacci numbers. I've written the following code, but it seems to forget about the 0 in the beginning since the output always is n+1 and not n.
let rec fib = function
    | 0 -> 0
    | 1 -> 1
    | n -> fib(n-1) + fib(n-2)
;;
Keep in mind that this is the first week, so I should be able to create these with those methods.
Your first snippet mostly suffers from two issues:
In F#, there is a difference between float and int. You write integer values as 4 or 0 and you write float values as 4.0 or 0.0. F# does not automatically convert integers to floats, so you need to be consistent.
Your syntax in the f function is a bit odd - I'm not sure what float is supposed to mean there and the fun and function constructs behave differently.
So, starting with your original code:
let g = fun n -> n + 4;;
This works, but I would not write it as an explicit function using fun - you can use let to define functions too and it is simpler. Also, you only need ;; in F# Interactive, but if you're using any decent editor with a command for sending code to F# Interactive (via Alt+Enter) you do not need that.
However, in your f function, you want to return float so you need to modify g to return float too. This means replacing 4 with 4.0:
let g n = n + 4.0
The h function is good, but you can again write it using let:
let h (x,y) = System.Math.Sqrt((x*x)+(y*y));;
In your f function, you can either use function to write a function using pattern matching, or you can use more verbose syntax using match (function is just a shorthand for writing a function and then pattern matching on the input):
let f = function
  | (n,0.0) -> g(n)
  | (x,n) -> h(x,n)

let f (x, y) =
  match (x, y) with
  | (n,0.0) -> g(n)
  | (x,n) -> h(x,n)
Also note that the indentation matters - you need spaces before |.
I'm going to address your first block of code, and leave the Fibonacci function for later. First I'll repost your code, then I'll talk about it.
let g = fun n -> n + 4;;
let h = fun (x,y) -> System.Math.Sqrt((x*x)+(y*y));;
let f = fun (x,n) -> float
|(n,0) -> g(n)
|(x,n) -> h(x,n);;
First comment: If you're defining a function and assigning it immediately to a name, like in all these examples, you don't need the fun keyword. The usual way to define functions is to write them as let (name) (parameters) = (function body). So your code above would become:
let g n = n + 4;;
let h (x,y) = System.Math.Sqrt((x*x)+(y*y));;
let f (x,n) = float
|(n,0) -> g(n)
|(x,n) -> h(x,n);;
I haven't made any other changes, so your f function still has an error in it. Let's address that error next.
I think the mistake you're making here is to think that fun and function are interchangeable. They're not. fun is standard function definition, but function is something else. It's a very common pattern in F# to write functions like the following:
let someFunc parameter =
    match parameter with
    | "case 1" -> printfn "Do something"
    | "case 2" -> printfn "Do something else"
    | _ -> printfn "Default behavior"
The function keyword is shorthand for one parameter plus a match expression. In other words, this:
let someFunc = function
    | "case 1" -> printfn "Do something"
    | "case 2" -> printfn "Do something else"
    | _ -> printfn "Default behavior"
is exactly the same code as this:
let someFunc parameter =
    match parameter with
    | "case 1" -> printfn "Do something"
    | "case 2" -> printfn "Do something else"
    | _ -> printfn "Default behavior"
with just one difference. In the version with the function keyword, you don't get to pick the name of the parameter. It gets automatically created by the F# compiler, and since you can't know in advance what the name of the parameter will be, you can't refer to it in your code. (Well, there are ways, but I don't want to make you learn to run before you have learned to walk, so to speak). And one more thing: while you're still learning F#, I strongly recommend that you do NOT use the function keyword. It's really useful once you know what you're doing, but in your early learning stages you should use the more explicit match (parameter) with expressions. That way you'll get used to seeing what it's doing. Once you've been doing F# for a few months, then you can start replacing those let f param = match param with (...) expressions with the shorter let f = function (...). But until match param with (...) has really sunk in and you've understood it, you should continue to type it out explicitly.
So your f function should have looked like:
let f (x,n) =
    match (x,n) with
    |(n,0) -> g(n)
    |(x,n) -> h(x,n);;
I see that while I was typing this, Tomas Petricek posted a response, and it addresses the incorrect usage of float, so I won't duplicate his explanation of why you're going to get an error on the word float in your f function. And he also explained about ;;, so I won't duplicate that either. I'll just say that when he mentions "any decent editor with command for sending code to F# interactive (via Alt+Enter)", there are a lot of editor choices -- but as a beginner, you might just want someone to recommend one to you, so I'll recommend one. First off, though: if you're on Windows, you might be using Visual Studio already, in which case you should stick to Visual Studio since you know it. It's a good editor for F#. But if you don't use Visual Studio yet, I don't recommend downloading it just to play around with F#. It's a beast of a program, designed for professional software developers to do all sorts of things they need to do in their jobs, and so it can feel a bit overwhelming if you're just getting started. So I would actually recommend something more lightweight: the editor called Visual Studio Code. It's cross-platform, and will run perfectly well on Linux, OS X or on Windows. Once you've downloaded and installed VS Code, you'll then want to install the Ionide extension. Ionide is a plugin for VS Code (and also for Atom, though the Atom version of Ionide is updated less often since all the Ionide developers use VS Code now) that makes F# editing a real pleasure. There are actually three extensions you'll find: Ionide-fsharp, Ionide-FAKE, and Ionide-Paket. Download and install all three: FAKE and Paket are two tools for F# programming that you might not need yet, but once you do need them, you'll already have them installed.
Okay, that's enough to get you started, I think.

Value polymorphism and "generating an exception"

Per The Definition of Standard ML (Revised):
The idea is that dynamic evaluation of a non-expansive expression will neither generate an exception nor extend the domain of the memory, while the evaluation of an expansive expression might.
[§4.7, p19; emphasis mine]
I've found a lot of information online about the ref-cell part, but almost none about the exception part. (A few sources point out that it's still possible for a polymorphic binding to raise Bind, and that this inconsistency can have type-theoretic and/or implementation consequences, but I'm not sure whether that's related.)
I've been able to come up with one exception-related unsoundness that, if I'm not mistaken, is prevented only by the value restriction; but that unsoundness does not depend on raising an exception:
local
  val (wrapAnyValueInExn, unwrapExnToAnyType) =
    let exception EXN of 'a
    in (EXN, fn EXN value => value)
    end
in
  val castAnyValueToAnyType = fn value => unwrapExnToAnyType (wrapAnyValueInExn value)
end
So, can anyone tell me what the Definition is getting at, and why it mentions exceptions?
(Is it possible that "generate an exception" means generating an exception name, rather than generating an exception packet?)
I'm not a type theorist or formal semanticist, but I think I understand what the definition is trying to get at from an operational point of view.
ML exceptions being generative means that, whenever the flow of control reaches the same exception declaration twice, two different exceptions are created. Not only are these distinct objects in memory, but these objects are also extensionally unequal: we can distinguish these objects by pattern-matching against exception constructors.
[Incidentally, this shows an important difference between ML exceptions and exceptions in most other languages. In ML, new exception classes can be created at runtime.]
On the other hand, if your program builds the same list of integers twice, you may have two different objects in memory, but your program has no way to distinguish between them. They are extensionally equal.
As an example of why generative exceptions are useful, consider MLton's sample implementation of a universal type:
signature UNIV =
sig
  type univ
  val embed : unit -> { inject : 'a -> univ
                      , project : univ -> 'a option
                      }
end
structure Univ :> UNIV =
struct
  type univ = exn
  fun 'a embed () =
    let
      exception E of 'a
    in
      { inject = E
      , project = fn (E x) => SOME x | _ => NONE
      }
    end
end
This code would cause a huge type safety hole if ML had no value restriction:
val { inject = inj1, project = proj1 } = Univ.embed ()
val { inject = inj2, project = proj2 } = Univ.embed ()
(* `inj1` and `proj1` share the same internal exception. This is
* why `proj1` can project values injected with `inj1`.
*
* `inj2` and `proj2` similarly share the same internal exception.
* But this exception is different from the one used by `inj1` and
* `proj1`.
*
* Furthermore, the value restriction makes all of these functions
* monomorphic. However, at this point, we don't know yet what these
* monomorphic types might be.
*)
val univ1 = inj1 "hello"
val univ2 = inj2 5
(* Now we do know:
*
* inj1 : string -> Univ.univ
* proj1 : Univ.univ -> string option
* inj2 : int -> Univ.univ
* proj2 : Univ.univ -> int option
*)
val NONE = proj1 univ2
val NONE = proj2 univ1
(* Which confirms that exceptions are generative. *)
val SOME str = proj1 univ1
val SOME int = proj2 univ2
(* Without the value restriction, `str` and `int` would both
* have type `'a`, which is obviously unsound. Thanks to the
* value restriction, they have types `string` and `int`,
* respectively.
*)
[Hat-tip to Eduardo León's answer for stating that the Definition is indeed referring to this, and for bringing in the phrase "generative exceptions". I've upvoted his answer, but am posting this separately, because I felt that his answer came at the question from the wrong direction, somewhat: most of that answer is an exposition of things that are already presupposed by the question.]
Is it possible that "generate an exception" means generating an exception name, rather than generating an exception packet?
Yes, I think so. Although the Definition doesn't usually use the word "exception" alone, other sources do commonly refer to exception names as simply "exceptions" — including in the specific context of generating them. For example, from http://mlton.org/GenerativeException:
In Standard ML, exception declarations are said to be generative, because each time an exception declaration is evaluated, it yields a new exception.
(And as you can see there, that page consistently refers to exception names as "exceptions".)
The Standard ML Basis Library, likewise, uses "exception" in this way. For example, from page 29:
At one extreme, a programmer could employ the standard exception General.Fail everywhere, letting it carry a string describing the particular failure. […] For example, one technique is to have a function sampleFn in a structure Sample raise the exception Fail "Sample.sampleFn".
As you can see, this paragraph uses the term "exception" twice, once in reference to an exception name, and once in reference to an exception value, relying on context to make the meaning clear.
So it's quite reasonable for the Definition to use the phrase "generate an exception" to refer to generating an exception name (though even so, it is probably a small mistake; the Definition is usually more precise and formal than this, and usually indicates when it intends to rely on context for disambiguation).

Haskell function definition convention

I am a beginner in Haskell.
The convention used for function definitions in my school material is as follows:
function_name arguments_separated_by_spaces = code_to_do
ex :
f a b c = a * b + c
As a mathematics student, I am used to writing functions as follows:
function_name(arguments_separated_by_commas) = code_to_do
ex :
f(a,b,c) = a * b + c
It works in Haskell.
My question is whether it works in all cases.
I mean, can I use the traditional mathematical convention in Haskell function definitions as well?
If not, in which specific cases does the convention go wrong?
Thanks in advance :)
Let's say you want to define a function that computes the square of the hypotenuse of a right triangle. Either of the following definitions is valid:
hyp1 a b = a * a + b * b
hyp2(a,b) = a * a + b * b
However, they are not the same function! You can tell by looking at their types in GHCi:
>> :type hyp1
hyp1 :: Num a => a -> a -> a
>> :type hyp2
hyp2 :: Num a => (a, a) -> a
Taking hyp2 first (and ignoring the Num a => part for now), the type tells you that the function takes a pair (a, a) and returns another a (e.g., it might take a pair of integers and return another integer, or a pair of real numbers and return another real number). You use it like this:
>> hyp2 (3,4)
25
Notice that the parentheses aren't optional here! They ensure that the argument is of the correct type, a pair of as. If you don't include them, you will get an error (which will probably look really confusing to you now, but rest assured that it will make sense when you've learned about type classes).
Now looking at hyp1 one way to read the type a -> a -> a is it takes two things of type a and returns something else of type a. You use it like this
>> hyp1 3 4
25
Now you will get an error if you do include parentheses!
So the first thing to notice is that the way you use the function has to match the way you defined it. If you define the function with parens, you have to use parens every time you call it. If you don't use parens when you define the function, you can't use them when you call it.
So it seems like there's no reason to prefer one over the other - it's just a matter of taste. But actually I think there is a good reason to prefer one over the other, and you should prefer the style without parentheses. There are three good reasons:
It looks cleaner and makes your code easier to read if you don't have parens cluttering up the page.
You will take a performance hit if you use parens everywhere, because you need to construct and deconstruct a pair every time you use the function (although the compiler may optimize this away - I'm not sure).
You want to get the benefits of currying, aka partially applied functions*.
The last point is a little subtle. Recall that I said that one way to understand a function of type a -> a -> a is that it takes two things of type a, and returns another a. But there's another way to read that type, which is a -> (a -> a). That means exactly the same thing, since the -> operator is right-associative in Haskell. The interpretation is that the function takes a single a, and returns a function of type a -> a. This allows you to just provide the first argument to the function, and apply the second argument later, for example
>> let f = hyp1 3
>> f 4
25
This is practically useful in a wide variety of situations. For example, the map function lets you apply some function to every element of a list:
>> :type map
map :: (a -> b) -> [a] -> [b]
Say you have the function (++ "!") which adds a bang to any String. But you have lists of Strings and you'd like them all to end with a bang. No problem! You just partially apply the map function
>> let bang = map (++ "!")
Now bang is a function of type**
>> :type bang
bang :: [String] -> [String]
and you can use it like this
>> bang ["Ready", "Set", "Go"]
["Ready!", "Set!", "Go!"]
Pretty useful!
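If you ever need to move between the two styles, the Prelude functions curry and uncurry convert one into the other. Here is a small sketch using the two definitions from earlier; the main function is only there to make it runnable.

```haskell
hyp1 :: Num a => a -> a -> a
hyp1 a b = a * a + b * b

hyp2 :: Num a => (a, a) -> a
hyp2 (a, b) = a * a + b * b

main :: IO ()
main = do
  print (uncurry hyp1 (3, 4))      -- curried function applied to a pair: 25
  print (curry hyp2 3 4)           -- tupled function applied to two arguments: 25
  print (map (hyp1 3) [0, 4, 10])  -- partial application: [9,25,109]
```

Only the curried version can be partially applied directly, as in the last line; with hyp2 you would have to go through curry first.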
I hope I've convinced you that the convention used in your school's educational material has some pretty solid reasons for being used. As someone with a math background myself, I can see the appeal of using the more 'traditional' syntax but I hope that as you advance in your programming journey, you'll be able to see the advantages in changing to something that's initially a bit unfamiliar to you.
* Note for pedants - I know that currying and partial application are not exactly the same thing.
** Actually GHCi will tell you the type is bang :: [[Char]] -> [[Char]] but since String is a synonym for [Char] these mean the same thing.
f(a,b,c) = a * b + c
The key difference to understand is that the above function takes a triple and gives the result. What you are actually doing is pattern matching on a triple. The type of the above function is something like this:
(a, a, a) -> a
If you write functions like this:
f a b c = a * b + c
you get currying automatically.
You can write things like this let b = f 3 2 and it will typecheck but the same thing will not work with your initial version. Also, things like currying can help a lot while composing various functions using (.) which again cannot be achieved with the former style unless you are trying to compose triples.
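A short sketch of both points, partial application and composition with (.). The names addSix and negated are made up for illustration.

```haskell
f :: Num a => a -> a -> a -> a
f a b c = a * b + c

-- Supplying only two of the three arguments leaves a one-argument function.
addSix :: Int -> Int
addSix = f 3 2

-- The partially applied function composes directly with (.).
negated :: Int -> Int
negated = negate . f 3 2

main :: IO ()
main = do
  print (addSix 4)              -- 10
  print (negated 4)             -- -10
  print (map (f 3 2) [1, 2, 3]) -- [7,8,9]
```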
Mathematical notation is not consistent. If all functions were given arguments using (,), you would have to write (+)((*)(a,b),c) to pass a*b and c to function + - of course, a*b is worked out by passing a and b to function *.
It is possible to write everything in tupled form, but it is much harder to define composition. Whereas now you can specify a type a->b to cover for functions of any arity (therefore, you can define composition as a function of type (b->c)->(a->b)->(a->c)), it is much trickier to define functions of arbitrary arity using tuples (now a->b would only mean a function of one argument; you can no longer compose a function of many arguments with a function of many arguments). So, technically possible, but it would need a language feature to make it simple and convenient.
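For the two-argument case, the Prelude already ships curry and uncurry to convert between the two forms, which may help to see how the single composition operator covers curried functions of higher arity. The functions addTupled, addCurried, pairSums and addThenDouble below are hypothetical names for this sketch.

```haskell
-- A two-argument function in tupled form.
addTupled :: (Int, Int) -> Int
addTupled (x, y) = x + y

-- 'curry' turns it into the curried form, which can be partially applied.
addCurried :: Int -> Int -> Int
addCurried = curry addTupled

-- 'uncurry' goes the other way, e.g. to consume the pairs produced by 'zip'.
pairSums :: [Int] -> [Int] -> [Int]
pairSums xs ys = map (uncurry (+)) (zip xs ys)

-- (.) :: (b -> c) -> (a -> b) -> a -> c still applies to a "two-argument"
-- function: after supplying x, what is left is an ordinary Int -> Int.
addThenDouble :: Int -> Int -> Int
addThenDouble x = (* 2) . addCurried x
```

Here addThenDouble 1 2 evaluates to 6 and pairSums [1, 2] [10, 20] to [11, 22].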

Why encode function in data type definition?

I find it hard to get the intuition about encoding a function in a data type definition. This is done in the definitions of the State and IO types, e.g.
data State s a = State (s -> (a,s))
type IO a = RealWorld -> (a, RealWorld) -- this is type synonym though, not new type
I would like to see a more trivial example to understand its value, so I could build on it towards more complex examples. E.g. say I have a data structure: would it make any sense to encode a function in one of the data constructors?
data Tree = Node Int (Tree) (Tree) (? -> ?) | E
I am not sure what I am trying to do here, but what could be an example of a function that I can encode in such a type? And why would I have to encode it in the type, but not use it as a normal function, I don't know, maybe passed as argument when needed?
Really, functions are just data like anything else.
Prelude> :i (->)
data (->) a b -- Defined in `GHC.Prim'
instance Monad ((->) r) -- Defined in `GHC.Base'
instance Functor ((->) r) -- Defined in `GHC.Base'
This comes out very naturally and without anything conceptually surprising if you consider only functions from, say, Int. I'll give them a strange name: (remember that (->) a b means a->b)
type Array = (->) Int
What? Well, what's the most important operation on an array?
Prelude> :t (Data.Array.!)
(Data.Array.!) :: GHC.Arr.Ix i => GHC.Arr.Array i e -> i -> e
Prelude> :t (Data.Vector.!)
(Data.Vector.!) :: Data.Vector.Vector a -> Int -> a
Let's define something like that for our own array type:
(!) :: Array a -> Int -> a
(!) = ($)
Now we can do
test :: Array String
test 0 = "bla"
test 1 = "foo"
FnArray> test ! 0
"bla"
FnArray> test ! 1
"foo"
FnArray> test ! 2
"*** Exception: :8:5-34: Non-exhaustive patterns in function test
Compare this to
Prelude Data.Vector> let test = fromList ["bla", "foo"]
Prelude Data.Vector> test ! 0
"bla"
Prelude Data.Vector> test ! 1
"foo"
Prelude Data.Vector> test ! 2
"*** Exception: ./Data/Vector/Generic.hs:244 ((!)): index out of bounds (2,2)
Not all that different, right? It's Haskell's enforcement of referential transparency that guarantees us the return values of a function can actually be interpreted as inhabitant values of some container. This is one common way to look at the Functor instance: fmap transform f applies some transformation to the values "included" in f (as result values). This works by simply composing the transformation after the target function:
instance Functor ((->) r) where
  fmap transform f x = transform $ f x
(though you'd of course better write this simply fmap = (.).)
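As a quick illustration of the "function as container" reading, the following sketch uses the Functor instance for functions that the Prelude already provides. The names squares, squaresAsText and lookups are invented for the example.

```haskell
-- A "container" indexed by Int, represented as a plain function.
squares :: Int -> Int
squares i = i * i

-- fmap over a function is post-composition: it transforms every "stored" value.
squaresAsText :: Int -> String
squaresAsText = fmap show squares      -- same as show . squares

-- Looking up an "element" is just function application.
lookups :: [String]
lookups = map squaresAsText [0, 1, 2, 3]
```

Evaluating lookups yields ["0","1","4","9"], exactly what mapping show over the first four squares in a list would give.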
Now, what's a bit more confusing is that the (->) type constructor has one more type argument: the argument type. Let's focus on that by defining
{-# LANGUAGE TypeOperators #-}
newtype (:<-) a b = BackFunc (b->a)
To get some feel for it:
show' :: Show a => String :<- a
show' = BackFunc show
i.e. it's really just function arrows written the other way around.
Is (:<-) Int some sort of container, similarly to how (->) Int resembles an array? Not quite. We can't define instance Functor (a :<-). Yet, mathematically speaking, (a :<-) is a functor, but of a different kind: a contravariant functor.
instance Contravariant ((:<-) a) where
  contramap transform (BackFunc f) = BackFunc $ f . transform
"Ordinary" functors OTOH are covariant functors. The naming is rather easy to understand if you compare directly:
fmap :: Functor f => (a->b) -> f a->f b
contramap :: Contravariant f => (b->a) -> f a->f b
While contravariant functors aren't nearly as commonly used as covariant ones, you can use them in much the same way when reasoning about data flow etc. When using functions in data fields, it's really covariant vs. contravariant you should think about first and foremost, not functions vs. values, because really, there is nothing special about functions compared to "static values" in a purely functional language.
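For a ready-made contravariant functor, one option is the Predicate type from Data.Functor.Contravariant. This sketch assumes a base version that ships that module (4.12 or later, as far as I know); isEven, hasEvenLength and lengthPlusOne are names made up here.

```haskell
import Data.Functor.Contravariant (Predicate (..), contramap)

-- A predicate consumes values of type a, so a sits in argument position.
isEven :: Predicate Int
isEven = Predicate even

-- contramap pre-processes the input: adapt an Int predicate to Strings by
-- taking the length first.
hasEvenLength :: Predicate String
hasEvenLength = contramap length isEven

-- The covariant counterpart post-processes the output instead.
lengthPlusOne :: String -> Int
lengthPlusOne = fmap (+ 1) length
```

So getPredicate hasEvenLength "ab" is True while getPredicate hasEvenLength "abc" is False: the transformation runs before the wrapped function, whereas with fmap it runs after.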
About your Tree type
I don't think this data type could be made something really useful, but we can do something stupid with a similar type that may illustrate the points I made above:
data Tree' = Node' Int (Bool -> Tree') | E'
That is, disregarding performance, isomorphic to the usual
data Tree = Node Int Tree Tree | E
Why? Well, Bool -> Tree is similar to Array Tree, except we don't use Ints for indexing but Bools. And there are only two evaluatable Boolean values. Arrays with fixed size 2 are usually called tuples (pairs). And with Bool -> Tree ≅ (Tree, Tree) we have Node Int (Bool -> Tree) ≅ Node Int Tree Tree.
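The isomorphism can be written down directly. In this sketch the function-based type gets primed constructor names so both types can live in one module, and the choice of False for the left subtree is arbitrary; toFn and fromFn are names invented for the illustration.

```haskell
data Tree = Node Int Tree Tree | E
  deriving (Eq, Show)

-- Function-based variant; primed names avoid clashing with Tree above.
data Tree' = Node' Int (Bool -> Tree') | E'

-- False selects the left subtree, True the right one.
toFn :: Tree -> Tree'
toFn E = E'
toFn (Node n l r) = Node' n (\b -> if b then toFn r else toFn l)

-- Going back only needs to "index" the function at its two possible inputs.
fromFn :: Tree' -> Tree
fromFn E' = E
fromFn (Node' n children) =
  Node n (fromFn (children False)) (fromFn (children True))
```

For any finite tree t, fromFn (toFn t) gives back t, which is the sense in which the two definitions carry the same information.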
Admittedly this isn't all that interesting. With functions from a fixed domain the isomorphisms are usually obvious. The interesting cases are polymorphic on the function domain and/or codomain, which always leads to somewhat abstract results such as the state monad. But even in those cases, you can remember that nothing really separates functions from other data types in Haskell.
You generally start learning FP with two concepts: data types and functions. Once you are confident designing programs with these two concepts, I would suggest you start using only one concept, that of types, which means:
You define new types by combining the existing types or type constructors in the language.
You define new type constructors to abstract out a general concept in your problem domain.
A function is just a type which maps a particular type to another type. This means that the types a function maps could themselves be functions, and so on (because we just said that functions are types). This is what people generally call higher-order functions, and it also gives you the illusion that a function takes multiple parameters, whereas in reality a function type always maps one type to another type (i.e. it is a unary function), but that other type can itself be a function type.
Example: add :: Int -> Int -> Int is the same as add :: Int -> (Int -> Int). add is a (function) type which maps an Int to a (function) type which maps an Int to an Int.
To create a Function type we use the (->) type constructor provided by Haskell.
Thinking in terms of above points you will find that the line between data types and functions is no more there.
As far as which type to choose is concerned, it solely depends on the problem domain you are trying to solve. Basically, whenever you find that you need some sort of mapping from one type to another, you will use the (->) type.
The State is defined using function type because the way we represent state in FP is "a mapping which takes current state and returns a value and new state", as you can see that there is a mapping happening here and hence the use of (->) type.
Let's see if this helps. Unfortunately for beginners, the definition of State quotes State both on the left and right hand side, but they have different meaning: one is the name of the type, the other is the name of the constructor. So the definition is really:
data State s a = St (s -> (a,s))
Which means you can construct a value of type State s a using constructor St and passing it a function from s to (a,s), that is, a function that can construct a value of some type a and a value of next state s from the previous state. This is a simple way to represent a state transition.
In order to see why passing a function is useful, you need to study how the rest of it works. For example, we can construct new value of type State s a given two other values by composing the functions. By composing such States, such state transition functions, you get a state machine, which then can be used to compute a value and final state, given an initial state.
runStateMachine :: State s a -> s -> (a,s)
runStateMachine (St f) x = f x -- or shorter, runStateMachine (St f) = f -- just unwrap the function from the constructor
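To show what "composing such state transition functions" might look like, here is a minimal sketch. It repeats the State definition so that it stands alone; andThen, tick and twoTicks are hypothetical names (andThen plays the role that >>= plays in the real State monad).

```haskell
data State s a = St (s -> (a, s))

runStateMachine :: State s a -> s -> (a, s)
runStateMachine (St f) = f

-- Sequence two transitions: run the first, then feed its result and the
-- updated state into the second.
andThen :: State s a -> (a -> State s b) -> State s b
andThen (St f) k = St $ \s0 ->
  let (a, s1) = f s0
  in runStateMachine (k a) s1

-- A transition that returns the current counter and increments it.
tick :: State Int Int
tick = St (\n -> (n, n + 1))

-- Two ticks composed into one bigger transition.
twoTicks :: State Int (Int, Int)
twoTicks = tick `andThen` \a -> tick `andThen` \b -> St (\s -> ((a, b), s))
```

Running runStateMachine twoTicks 5 produces ((5,6),7): each tick saw the state left behind by the previous one, although no mutable variable is involved anywhere.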

What is an existential type?

I read through the Wikipedia article Existential types. I gathered that they're called existential types because of the existential operator (∃). I'm not sure what the point of it is, though. What's the difference between
T = ∃X { X a; int f(X); }
and
T = ∀X { X a; int f(X); }
?
When someone defines a universal type ∀X they're saying: You can plug in whatever type you want, I don't need to know anything about the type to do my job, I'll only refer to it opaquely as X.
When someone defines an existential type ∃X they're saying: I'll use whatever type I want here; you won't know anything about the type, so you can only refer to it opaquely as X.
Universal types let you write things like:
<T> void copy(List<T> source, List<T> dest) {
...
}
The copy function has no idea what T will actually be, but it doesn't need to know.
Existential types would let you write things like:
interface VirtualMachine<B> {
B compile(String source);
void run(B bytecode);
}
// Now, if you had a list of VMs you wanted to run on the same input:
void runAllCompilers(List<∃B:VirtualMachine<B>> vms, String source) {
for (∃B:VirtualMachine<B> vm : vms) {
B bytecode = vm.compile(source);
vm.run(bytecode);
}
}
Each virtual machine implementation in the list can have a different bytecode type. The runAllCompilers function has no idea what the bytecode type is, but it doesn't need to; all it does is relay the bytecode from VirtualMachine.compile to VirtualMachine.run.
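Since the Java above is pseudo-code, here is roughly the same idea in a language that does have existential types, namely Haskell with the ExistentialQuantification extension. This is only a sketch: run returns a String instead of performing an effect, and lengthVM and wordsVM are toy machines invented for the example.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- A virtual machine packaged together with its hidden bytecode type b.
-- The first field compiles source text, the second runs the bytecode.
data VirtualMachine = forall b. VirtualMachine (String -> b) (b -> String)

-- Consumers can only relay the bytecode from compile to run.
runOne :: String -> VirtualMachine -> String
runOne source (VirtualMachine compile run) = run (compile source)

runAllCompilers :: [VirtualMachine] -> String -> [String]
runAllCompilers vms source = map (runOne source) vms

-- Two toy implementations with different bytecode types (Int and [String]).
lengthVM :: VirtualMachine
lengthVM = VirtualMachine length (\n -> "length " ++ show n)

wordsVM :: VirtualMachine
wordsVM = VirtualMachine words (\ws -> "words " ++ show (length ws))
```

Both machines fit in the same list even though their bytecode types differ: runAllCompilers [lengthVM, wordsVM] "a b c" gives ["length 5","words 3"].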
Java type wildcards (e.g. List<?>) are a very limited form of existential types.
Update: Forgot to mention that you can sort of simulate existential types with universal types. First, wrap your universal type to hide the type parameter. Second, invert control (this effectively swaps the "you" and "I" part in the definitions above, which is the primary difference between existentials and universals).
// A wrapper that hides the type parameter 'B'
interface VMWrapper {
void unwrap(VMHandler handler);
}
// A callback (control inversion)
interface VMHandler {
<B> void handle(VirtualMachine<B> vm);
}
Now, we can have the VMWrapper call our own VMHandler which has a universally-typed handle function. The net effect is the same, our code has to treat B as opaque.
void runWithAll(List<VMWrapper> vms, final String input)
{
for (VMWrapper vm : vms) {
vm.unwrap(new VMHandler() {
public <B> void handle(VirtualMachine<B> vm) {
B bytecode = vm.compile(input);
vm.run(bytecode);
}
});
}
}
An example VM implementation:
class MyVM implements VirtualMachine<byte[]>, VMWrapper {
public byte[] compile(String input) {
return null; // TODO: somehow compile the input
}
public void run(byte[] bytecode) {
// TODO: Somehow evaluate 'bytecode'
}
public void unwrap(VMHandler handler) {
handler.handle(this);
}
}
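The same wrap-and-invert-control trick can be sketched in Haskell using only universal quantification (the RankNTypes extension), without any existential data type. As before, run returns a String here rather than performing an effect, and reverseVM and lengthVM are made-up machines.

```haskell
{-# LANGUAGE RankNTypes #-}

-- An ordinary record parameterised by its bytecode type b.
data VirtualMachine b = VirtualMachine
  { compile :: String -> b
  , run     :: b -> String
  }

-- The wrapper hides b: all it offers is "give me a handler that works for
-- every b, and I will call it with my VM".
newtype VMWrapper = VMWrapper (forall r. (forall b. VirtualMachine b -> r) -> r)

wrap :: VirtualMachine b -> VMWrapper
wrap vm = VMWrapper (\handler -> handler vm)

-- The handler is universally typed, so it has to treat b as opaque.
runWith :: String -> VMWrapper -> String
runWith input (VMWrapper unwrap) = unwrap (\vm -> run vm (compile vm input))

runWithAll :: [VMWrapper] -> String -> [String]
runWithAll vms input = map (runWith input) vms

-- Two toy machines with different bytecode types.
reverseVM :: VirtualMachine String
reverseVM = VirtualMachine { compile = reverse, run = ("ran " ++) }

lengthVM :: VirtualMachine Int
lengthVM = VirtualMachine { compile = length, run = \n -> "ran " ++ show n }
```

With these definitions, runWithAll [wrap reverseVM, wrap lengthVM] "abc" evaluates to ["ran cba","ran 3"].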
A value of an existential type like ∃x. F(x) is a pair containing some type x and a value of the type F(x). Whereas a value of a polymorphic type like ∀x. F(x) is a function that takes some type x and produces a value of type F(x). In both cases, the type closes over some type constructor F.
Note that this view mixes types and values. The existential proof is one type and one value. The universal proof is an entire family of values indexed by type (or a mapping from types to values).
So the difference between the two types you specified is as follows:
T = ∃X { X a; int f(X); }
This means: A value of type T contains a type called X, a value a:X, and a function f:X->int. A producer of values of type T gets to choose any type for X and a consumer can't know anything about X. Except that there's one example of it called a and that this value can be turned into an int by giving it to f. In other words, a value of type T knows how to produce an int somehow. Well, we could eliminate the intermediate type X and just say:
T = int
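In Haskell with ExistentialQuantification, the claim that this T carries no more information than an int can be sketched as a pair of conversion functions. MkT, toInt, fromInt and examples are names chosen for the illustration.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- T = ∃X { X a; int f(X); }: a hidden type x, a value of it, and a function.
data T = forall x. MkT x (x -> Int)

-- The only thing a consumer can do: feed the value to the function.
toInt :: T -> Int
toInt (MkT a f) = f a

-- Conversely, every Int can be packaged as a T (here choosing x = Int).
fromInt :: Int -> T
fromInt n = MkT n id

-- Producers are free to pick different hidden types.
examples :: [T]
examples = [MkT "hello" length, MkT 'x' fromEnum, fromInt 7]
```

map toInt examples yields [5,120,7], and toInt (fromInt n) is n for every n, so nothing observable distinguishes a T from the Int it produces.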
The universally quantified one is a little different.
T = ∀X { X a; int f(X); }
This means: A value of type T can be given any type X, and it will produce a value a:X, and a function f:X->int no matter what X is. In other words: a consumer of values of type T can choose any type for X. And a producer of values of type T can't know anything at all about X, but it has to be able to produce a value a for any choice of X, and be able to turn such a value into an int.
Obviously implementing this type is impossible, because there is no program that can produce a value of every imaginable type. Unless you allow absurdities like null or bottoms.
Since an existential is a pair, an existential argument can be converted to a universal one via currying.
(∃b. F(b)) -> Int
is the same as:
∀b. (F(b) -> Int)
The former is a rank-2 existential. This leads to the following useful property:
Every existentially quantified type of rank n+1 is a universally quantified type of rank n.
There is a standard algorithm for turning existentials into universals, called Skolemization.
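The currying step can be spelled out in Haskell. This sketch assumes the ExistentialQuantification and RankNTypes extensions and models F as a type-constructor parameter f; Some, toUniversal, fromUniversal and lengthOfSome are hypothetical names.

```haskell
{-# LANGUAGE ExistentialQuantification, RankNTypes #-}

-- ∃b. F(b), with the type constructor F passed in as the parameter f.
data Some f = forall b. Some (f b)

-- (∃b. F(b)) -> Int  ==>  ∀b. (F(b) -> Int)
-- The implicit outermost forall over b is the ∀b of the text.
toUniversal :: (Some f -> Int) -> f b -> Int
toUniversal g x = g (Some x)

-- ∀b. (F(b) -> Int)  ==>  (∃b. F(b)) -> Int
fromUniversal :: (forall b. f b -> Int) -> Some f -> Int
fromUniversal h (Some x) = h x

-- Example with F = []: length works for every element type.
lengthOfSome :: Some [] -> Int
lengthOfSome = fromUniversal length
```

For instance, lengthOfSome (Some "abc") is 3 and lengthOfSome (Some [True]) is 1, without the consumer ever learning the element type.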
I think it makes sense to explain existential types together with universal types, since the two concepts are complementary, i.e. one is the "opposite" of the other.
I cannot answer every detail about existential types (such as giving an exact definition, list all possible uses, their relation to abstract data types, etc.) because I'm simply not knowledgeable enough for that. I'll demonstrate only (using Java) what this HaskellWiki article states to be the principal effect of existential types:
Existential types can be used for several different purposes. But what they do is to 'hide' a type variable on the right-hand side. Normally, any type variable appearing on the right must also appear on the left […]
Example set-up:
The following pseudo-code is not quite valid Java, even though it would be easy enough to fix that. In fact, that's exactly what I'm going to do in this answer!
class Tree<α>
{
α value;
Tree<α> left;
Tree<α> right;
}
int height(Tree<α> t)
{
return (t != null) ? 1 + max( height(t.left), height(t.right) )
: 0;
}
Let me briefly spell this out for you. We are defining…
a recursive type Tree<α> which represents a node in a binary tree. Each node stores a value of some type α and has references to optional left and right subtrees of the same type.
a function height which returns the furthest distance from any leaf node to the root node t.
Now, let's turn the above pseudo-code for height into proper Java syntax! (I'll keep on omitting some boilerplate for brevity's sake, such as object-orientation and accessibility modifiers.) I'm going to show two possible solutions.
1. Universal type solution:
The most obvious fix is to simply make height generic by introducing the type parameter α into its signature:
<α> int height(Tree<α> t)
{
return (t != null) ? 1 + max( height(t.left), height(t.right) )
: 0;
}
This would allow you to declare variables and create expressions of type α inside that function, if you wanted to. But...
2. Existential type solution:
If you look at our method's body, you will notice that we're not actually accessing, or working with, anything of type α! There are no expressions having that type, nor any variables declared with that type... so, why do we have to make height generic at all? Why can't we simply forget about α? As it turns out, we can:
int height(Tree<?> t)
{
return (t != null) ? 1 + max( height(t.left), height(t.right) )
: 0;
}
As I wrote at the very beginning of this answer, existential and universal types are complementary / dual in nature. Thus, if the universal type solution was to make height more generic, then we should expect that existential types have the opposite effect: making it less generic, namely by hiding/removing the type parameter α.
As a consequence, you can no longer refer to the type of t.value in this method nor manipulate any expressions of that type, because no identifier has been bound to it. (The ? wildcard is a special token, not an identifier that "captures" a type.) t.value has effectively become opaque; perhaps the only thing you can still do with it is type-cast it to Object.
Summary:
=====================+=====================================
                     |   universally       existentially
                     | quantified type    quantified type
---------------------+-------------------------------------
 calling method      |
 needs to know       |       yes                no
 the type argument   |
---------------------+-------------------------------------
 called method       |
 can use / refer to  |       yes                no
 the type argument   |
=====================+=====================================
These are all good examples, but I choose to answer it a little bit differently. Recall from math, that ∀x. P(x) means "for all x's, I can prove that P(x)". In other words, it is a kind of function, you give me an x and I have a method to prove it for you.
In type theory, we are not talking about proofs, but of types. So in this space we mean "for any type X you give me, I will give you a specific type P". Now, since we don't give P much information about X besides the fact that it is a type, P can't do much with it, but there are some examples. P can create the type of "all pairs of the same type": P<X> = Pair<X, X> = (X, X). Or we can create the option type: P<X> = Option<X> = X | Nil, where Nil is the type of the null pointers. We can make a list out of it: List<X> = (X, List<X>) | Nil. Notice that the last one is recursive, values of List<X> are either pairs where the first element is an X and the second element is a List<X> or else it is a null pointer.
Now, in math ∃x. P(x) means "I can prove that there is a particular x such that P(x) is true". There may be many such x's, but to prove it, one is enough. Another way to think of it is that there must exist a non-empty set of evidence-and-proof pairs {(x, P(x))}.
Translated to type theory: A type in the family ∃X.P<X> is a type X and a corresponding type P<X>. Notice that while before we gave X to P (so that we knew everything about X but P very little), the opposite is true now. P<X> doesn't promise to give any information about X, just that there is one, and that it is indeed a type.
How is this useful? Well, P could be a type that has a way of exposing its internal type X. An example would be an object which hides the internal representation of its state X. Though we have no way of directly manipulating it, we can observe its effect by poking at P. There could be many implementations of this type, but you could use all of these types no matter which particular one was chosen.
To directly answer your question:
With the universal type, uses of T must include the type parameter X. For example T<String> or T<Integer>. For the existential type uses of T do not include that type parameter because it is unknown or irrelevant - just use T (or in Java T<?>).
Further information:
Universal/abstract types and existential types are a duality of perspective between the consumer/client of an object/function and the producer/implementation of it. When one side sees a universal type the other sees an existential type.
In Java you can define a generic class:
public class MyClass<T> {
// T is existential in here
T whatever;
public MyClass(T w) { this.whatever = w; }
public static MyClass<?> secretMessage() { return new MyClass<String>("bazzlebleeb"); }
}
// T is universal from out here
MyClass<String> mc1 = new MyClass<>("foo");
MyClass<Integer> mc2 = new MyClass<>(123);
MyClass<?> mc3 = MyClass.secretMessage();
From the perspective of a client of MyClass, T is universal because you can substitute any type for T when you use that class, and you must know the actual type of T whenever you use an instance of MyClass.
From the perspective of instance methods in MyClass itself, T is existential because it doesn't know the real type of T.
In Java, ? represents the existential type - thus when you are inside the class, T is basically ?. If you want to handle an instance of MyClass with T existential, you can declare MyClass<?> as in the secretMessage() example above.
Existential types are sometimes used to hide the implementation details of something, as discussed elsewhere. A Java version of this might look like:
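public class ToDraw<T> {
T obj;
Function<Pair<T,Graphics>, Void> draw;
ToDraw(T obj, Function<Pair<T,Graphics>, Void> draw) { this.obj = obj; this.draw = draw; }
// Generic helper: binds the hidden type of a ToDraw<?> to the name T so that obj and draw line up
static <T> void draw(ToDraw<T> d, Graphics g) { d.draw.apply(new Pair<>(d.obj, g)); }
}
// Now you can put these in a list and draw them like so (g is a Graphics obtained elsewhere):
List<ToDraw<?>> drawList = ... ;
for (ToDraw<?> td : drawList) ToDraw.draw(td, g);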
It's a bit tricky to capture this properly because I'm pretending to be in some sort of functional programming language, which Java isn't. But the point here is that you are capturing some sort of state plus a list of functions that operate on that state and you don't know the real type of the state part, but the functions do since they were matched up with that type already.
Now, in Java all non-final non-primitive types are partly existential. This may sound strange, but because a variable declared as Object could potentially be a subclass of Object instead, you cannot declare the specific type, only "this type or a subclass". And so, objects are represented as a bit of state plus a list of functions that operate on that state - exactly which function to call is determined at runtime by lookup. This is very much like the use of existential types above where you have an existential state part and a function that operates on that state.
In statically typed programming languages without subtyping and casts, existential types allow one to manage lists of differently typed objects. A list of T<Int> cannot contain a T<Long>. However, a list of T<?> can contain any variation of T, allowing one to put many different types of data into the list and convert them all to an int (or do whatever operations are provided inside the data structure) on demand.
One can pretty much always convert a record with an existential type into a record without one by using closures. A closure is existentially typed, too, in that the free variables it is closed over are hidden from the caller. Thus a language that supports closures but not existential types can allow you to make closures that share the same hidden state that you would have put into the existential part of an object.
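A small Haskell sketch of that conversion, assuming the ExistentialQuantification extension: the first type hides its state behind an existential, the second hides it inside closures. Counter, Counter', toClosure and unaryCounter are all names invented for the example.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Existential version: hidden state s plus the functions that operate on it.
data Counter = forall s. Counter s (s -> s) (s -> Int)

-- Closure version: no hidden type variable; the state lives in the closures.
data Counter' = Counter'
  { next    :: Counter'
  , current :: Int
  }

-- Convert by closing over the state instead of storing it in a field.
toClosure :: Counter -> Counter'
toClosure (Counter s step observe) =
  Counter' { next = toClosure (Counter (step s) step observe)
           , current = observe s
           }

-- A counter whose hidden state is a list of units.
unaryCounter :: Counter
unaryCounter = Counter [] (() :) length
```

After the conversion, current (next (next (toClosure unaryCounter))) evaluates to 2, and the list-of-units representation is invisible in the type of Counter'.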
An existential type is an opaque type.
Think of a file handle in Unix. You know its type is int, so you can easily forge it. You can, for instance, try to read from handle 43. If it so happens that the program has a file open with this particular handle, you'll read from it. Your code doesn't have to be malicious, just sloppy (e.g., the handle could be an uninitialized variable).
An existential type is hidden from your program. If fopen returned an existential type, all you could do with it is to use it with some library functions that accept this existential type. For instance, the following pseudo-code would compile:
let exfile = fopen("foo.txt"); // No type for exfile!
read(exfile, buf, size);
The interface "read" is declared as:
There exists a type T such that:
size_t read(T exfile, char* buf, size_t size);
The variable exfile is not an int, not a char*, not a struct File—nothing you can express in the type system. You can't declare a variable whose type is unknown and you cannot cast, say, a pointer into that unknown type. The language won't let you.
Seems I’m coming a bit late, but anyway, this document adds another view of what existential types are. Although it is not language-agnostic, it should make existential types fairly easy to understand: http://www.cs.uu.nl/groups/ST/Projects/ehc/ehc-book.pdf (chapter 8)
The difference between a universally and existentially quantified type can be characterized by the following observation:
The use of a value with a ∀ quantified type determines the type to choose for the instantiation of the quantified type variable. For example, the caller of the identity function “id :: ∀a.a → a” determines the type to choose for the type variable a for this particular application of id. For the function application “id 3” this type equals Int.
The creation of a value with a ∃ quantified type determines, and hides, the type of the quantified type variable. For example, a creator of a “∃a.(a, a → Int)” may have constructed a value of that type from “(3, λx → x)”; another creator has constructed a value with the same type from “(’x’, λx → ord x)”. From a users point of view both values have the same type and are thus interchangeable. The value has a specific type chosen for type variable a, but we do not know which type, so this information can no longer be exploited. This value specific type information has been ‘forgotten’; we only know it exists.
A universal type exists for all values of the type parameter(s). An existential type exists only for values of the type parameter(s) that satisfy the constraints of the existential type.
For example in Scala one way to express an existential type is an abstract type which is constrained to some upper or lower bounds.
trait Existential {
type Parameter <: Interface
}
Equivalently a constrained universal type is an existential type as in the following example.
trait Existential[Parameter <: Interface]
Any use site can employ the Interface because any instantiable subtypes of Existential must define the type Parameter which must implement the Interface.
A degenerate case of an existential type in Scala is an abstract type which is never referred to and thus need not be defined by any subtype. This effectively has a shorthand notation of List[_] in Scala and List<?> in Java.
My answer was inspired by Martin Odersky's proposal to unify abstract and existential types. The accompanying slide aids understanding.
Research into abstract datatypes and information hiding brought existential types into programming languages. Making a datatype abstract hides info about that type, so a client of that type cannot abuse it. Say you've got a reference to an object... some languages allow you to cast that reference to a reference to bytes and do anything you want to that piece of memory. For purposes of guaranteeing behavior of a program, it's useful for a language to enforce that you only act on the reference to the object via the methods the designer of the object provides. You know the type exists, but nothing more.
See:
Abstract Types Have Existential Type, Mitchell & Plotkin
http://theory.stanford.edu/~jcm/papers/mitch-plotkin-88.pdf
As I understand it, it's a mathematical way to describe interfaces/abstract classes.
As for T = ∃X { X a; int f(X); }
For C# it would translate to a generic abstract type:
abstract class MyType<T>{
private T a;
public abstract int f(T x);
}
"Existential" just means that there is some type that obeys the rules defined here.