Does the term "monad" apply to values of types like Maybe or List, or does it instead apply only to the types themselves? - terminology

I've noticed that the word "monad" seems to be used in a somewhat inconsistent way. I've come to believe that this is because many (if not most) of the monad tutorials out there are written by folks who have only just started to figure monads out themselves (eg: nuclear waste spacesuit burritos), and so the term ends up getting kind of overloaded/corrupted.
In particular, I'm wondering whether the term "monad" can be applied to individual values of types like Maybe, List or IO, or if the term "monad" should really only be applied to the types themselves.
This is a subtle distinction, so perhaps an analogy might make it more clear. In mathematics we have rings, fields, groups, etc. These terms apply to an entire set of values along with the operations that can be performed on them, rather than to individual elements. For example, integers (along with the operations of addition, negation and multiplication) form a ring. You could say "Integer is a ring", but you would never say "5 is a ring".
So, can you say "Just 5 is a monad", or would that be as wrong as saying "5 is a ring"? I don't know category theory, but I'm under the impression that it really only makes sense to say "Maybe is a monad" and not "Just 5 is a monad".

"Monad" (and "Functor") are popularly misused as describing values.
No value is a monad, functor, monoid, applicative functor, etc.
Only types & type constructors (higher-kinded types) can be.
When you hear (and you will) that "lists are monoids" or "functions are monads", etc, or "this function takes a monad as an argument", don't believe it.
Ask the speaker "How can any value be a monoid (or monad or ...), considering that Haskell's classes classify types (including higher-order ones) rather than values?"
Lists are not monoids (etc). List a is.
My guess is that this popular misuse stems from mainstream languages having value classes and not type classes, so that habitual, unconscious value-class thinking sneaks in.
Why does it matter whether we use language precisely?
Because we think in language and we build & convey understandings via language.
So in order to have clear thoughts, it helps to have clear language (or be able to at any time).
"The slovenliness of our language makes it easier for us to have foolish thoughts. The point is that the process is reversible." - George Orwell, Politics and the English Language
Edit: These remarks apply to Haskell, not to the more general setting of category theory.

List is a monad, List a is a type, and [] is a List a (an element of a type).
Technically, a monad is a functor with extra structure; and in Haskell we only use functors from the category of Haskell types to itself.
It is thus in particular a "function" which takes a type and returns another type (it has kind * -> *).
List, State s, Maybe, etc are monads. State is not a monad, since it has kind * -> * -> *.
(aside: to confuse matters, Monads are just functors, and if I give myself a partially ordered set A, then it forms a category, with Hom(a, b) = { 1 element } if a <= b and Hom(a, b) = empty otherwise. Now any increasing function f : A -> A forms a functor, and monads are those functions which satisfy x <= f(x) and f(f(x)) <= f(x), hence f(f(x)) = f(x) -- monads here are technically "elements of A -> A". See also closure operators.)
(aside 2: since you appear to know some mathematics, I encourage you to read about category theory. You'll see among others that algebraic structures can be seen as arising from monads. See this excellent blog entry from the excellent blog by Dan Piponi for a teaser.)

To be exact, monads are structures from category theory. They don't have a direct code counterpart. For simplicity let's talk about general functors instead of monads. In the case of Haskell roughly speaking a functor is a mapping from a class of types to a class of types that also maps functions in the first class to functions in the second. The Functor instance gives you access to the mapping function, but doesn't directly capture the concept of functors.
It is however fair to say that the type constructor as mentioned in the Functor instance is the actual functor:
instance Functor Tree
In this case Tree is the functor. However, because Tree is a type constructor it can't stand for both mapping functions that make a functor at the same time. The function that maps functions is called fmap. So if you want to be precise you have to say that the tuple (Tree, fmap) is the functor, where fmap is the particular fmap from Tree's Functor instance. For convenience, again, we say that Tree is the functor, because the corresponding fmap follows from its Functor instance.
Note that functors are always types of kind * -> *. So Maybe Int is not a functor – the functor is Maybe. Also people often talk about "the state monad", which is also imprecise. State is a whole family of infinitely many state monads, as you can see in the instance:
instance Monad (State s)
For every type s the type constructor State s (of kind * -> *) is a state monad, one of many.
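To make the kinds concrete, here is a minimal sketch. The Tree type and the hand-rolled State newtype are assumptions for illustration (this State is not the one from the transformers library). The point it tries to show is that instances are attached to type constructors of kind * -> *, such as Tree and State s, and never to values or to fully applied types.

```haskell
-- A hypothetical binary tree; Tree has kind * -> *.
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Eq)

-- The instance head names the type constructor Tree, not Tree Int
-- and not any particular tree value.
instance Functor Tree where
  fmap _ Leaf         = Leaf
  fmap f (Node l x r) = Node (fmap f l) (f x) (fmap f r)

-- A hand-rolled state type (an assumption, not the one from transformers).
-- State has kind * -> * -> *, so only the partial application State s
-- can be given Functor/Applicative/Monad instances.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  State g >>= k = State $ \s ->
    let (a, s') = g s in runState (k a) s'

-- One member of the family: State Int. It returns the counter and bumps it.
tick :: State Int Int
tick = State $ \n -> (n, n + 1)

twoTicks :: State Int (Int, Int)
twoTicks = do
  a <- tick
  b <- tick
  pure (a, b)
```

Here twoTicks is a value of type State Int (Int, Int); the thing that is a monad is State Int.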

So, can you say "Just 5 is a monad", or would that be as wrong as saying "5 is a ring"?
Your intuition is exactly right. Int is to Ring (or AbelianGroup or whatever) as Maybe is to Monad (or Functor or whatever). Values (5, Just 5, etc.) are unimportant.
In algebra, we say the set of integers form a ring; in Haskell we would say (informally) that Int is a member of the Ring typeclass, or (slightly more formally) that there exists a Ring instance for Int. You might find this proposal fun and/or useful. Anyway, same deal with monads.
I don't know category theory, but ...
Whatever, if you know a thing or two about abstract algebra, you're golden.
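The analogy can be sketched in code. The Ring class below is hypothetical (base does not define one; Num is the nearest thing), and safeDiv is a made-up helper. The sketch only aims to show that the instance belongs to the type Integer, just as the Monad instance belongs to Maybe, while 5 and Just 5 are ordinary values that use those instances.

```haskell
-- A hypothetical Ring class; base does not define one (Num is the nearest).
class Ring a where
  zero :: a
  one  :: a
  add  :: a -> a -> a
  mul  :: a -> a -> a
  neg  :: a -> a

-- "Integer is a ring": the instance is attached to the type...
instance Ring Integer where
  zero = 0
  one  = 1
  add  = (+)
  mul  = (*)
  neg  = negate

-- ...while 5 is merely a value of that type and uses its operations.
five :: Integer
five = add one (add (add one one) (add one one))

-- In the same way Maybe is the Monad; Just 5 is a value of type Maybe Integer.
safeDiv :: Integer -> Integer -> Maybe Integer
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

example :: Maybe Integer
example = Just 5 >>= \x -> safeDiv (mul x x) five
```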

I would say "Just 5 is of a type that is an instance of Monad", just as I would say "5 is a number whose type (Integer) is a ring".
I use the term instance because that is how you declare an implementation of a typeclass in Haskell, and Monad is one of them.

Related

Advantage of Arrows over Functions

What is the advantage of arrows over regular functions in Haskell? What can they do that functions can't? Functions can map over structures using fmap.
On more of a broad picture, arrows get you out of Hask and into other categories there are to explore. The Kleisli category is probably the best-acquainted to Haskellers, followed by Cokleisli. Those are the natural "extensions" of Hask: add an endofunctor around either the result or argument, then you get a category again if
Kleisli: the functor is a monad, so id ≅ return :: a -> m a
(.) ≅ (<=<) :: (b->m c) -> (a->m b) -> a->m c
CoKleisli: the functor is a comonad, so id ≅ coreturn :: m a -> a and
(.) :: (m b->c) -> (m a->b) -> m a->c
(For that you don't need Arrow yet, only Category. But general categories aren't very interesting, you normally want monoidal or even cartesian closed categories, which is what Arrow is roughly aiming at.)
But there sure are lots of other categories. Most don't have much to do with Hask and can't be expressed with the standard Arrow class, mainly because the objects have special properties that not every Haskell type fulfills. Actually, if you add the ability to constrain the object types, the possibilities immediately become much wider. But even if you stay with the standard classes, perhaps even simply in ->, the point-free composition style that is natural with arrows often comes out very nice and concise, and opens up new ways to think about transformations.
Functions are only one instance of arrows; it's like asking "Why use monads instead of just Maybe?".
Anything you can do with arrows can of course be done with functions, since the Arrow (->) instance can only talk about one small part of functions, namely what's in the Arrow type class. However, arrows have more instances than just plain functions, so we can use the same functions to operate on more complex types.
Arrows are nice since they can carry a lot more structure than a plain function. When traversing with just fmap, we have no way to accumulate effects; arrows let us do that, and they are more general than monads! Consider the Kleisli arrow,
newtype Kleisli m a b = Kleisli {runKleisli :: a -> m b}
This forms an arrow when m is a monad. So every Monad forms an arrow, and thus we can build up monadic computations by seamlessly composing a -> m b's and do all sorts of useful things like this. Some XML libraries use arrows to abstract over functions from an element to its subelements and use this to traverse the document. Other parsers use arrows (their original purpose), though nowadays this seems to be falling out of favor for Applicative.
The point that you've hopefully noticed is that arrows are more generic. When we just talk about arrows, we avoid duplicating all the code that we would need to write to do something with our parsers, XML scrapers, and monadic functions!
It's just the same as opting for Monad over Maybe, we lose some power since we're no longer able to make specific statements, but we get more generic code in return.
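As a small sketch of that genericity, the snippet below composes two a -> m b steps with the same (>>>) operator that composes plain functions. Kleisli, (>>>) and arr come from Control.Arrow in base; parseInt, reciprocal, pipeline and plain are names made up for this example.

```haskell
import Control.Arrow (Kleisli (..), (>>>), arr)
import Text.Read (readMaybe)

-- Two effectful steps of shape a -> m b, here with m = Maybe.
parseInt :: String -> Maybe Int
parseInt = readMaybe

reciprocal :: Int -> Maybe Double
reciprocal 0 = Nothing
reciprocal n = Just (1 / fromIntegral n)

-- The same (>>>) that composes plain functions composes Kleisli arrows;
-- arr lifts a pure function into the arrow.
pipeline :: Kleisli Maybe String Double
pipeline = Kleisli parseInt >>> Kleisli reciprocal >>> arr (* 100)

-- With the (->) instance, (>>>) is just left-to-right function composition.
plain :: Int -> Int
plain = (+ 1) >>> (* 2)
```

Running runKleisli pipeline "4" threads the Maybe effect through every stage, so a parse failure or a zero short-circuits the whole pipeline to Nothing.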

In which languages is function abstraction not primitive

In Haskell the function type (->) is given; it's not an algebraic data type constructor and one cannot re-implement it to be identical to (->).
So I wonder, what languages will allow me to write my version of (->)? What is this property called?
UPD Reformulations of the question thanks to the discussion:
Which languages don't have -> as a primitive type?
Why is -> necessarily primitive?
I can't think of any languages that have arrows as a user-defined type. The reason is that arrows -- types for functions -- are baked into the type system, all the way down to the simply typed lambda calculus. That the arrow type must be fundamental to the language comes directly from the fact that the way you form functions in the lambda calculus is via lambda abstraction (which, at the type level, introduces arrows).
Although Marcin aptly notes that you can program in a point free style, this doesn't change the essence of what you're doing. Having a language without arrow types as primitives goes against the most fundamental building blocks of Haskell. (The language you reference in the question.)
Having the arrow as a primitive type also shares some important ties to constructive logic: you can read the function arrow type as implication from intuitionistic logic, and programs having that type as "proofs." (Namely, if you have something of type A -> B, you have a proof that takes some premise of type A, and produces a proof for B.)
The fact that you're perturbed by having arrows baked into the language might imply that you're not fundamentally grasping why they're so tied to the design of the language; perhaps it's time to read a few chapters from Ben Pierce's "Types and Programming Languages".
Edit: You can always look at languages which don't have a strong notion of functions and have their semantics defined in some other way -- such as Forth or PostScript -- but in these languages you don't define inductive data types in the same way as in functional languages like Haskell, ML, or Coq. To put it another way, in any language in which you define constructors for datatypes, arrows arise naturally from the constructors for these types. But in languages where you don't define inductive datatypes in the typical way, you don't get arrow types as naturally because the language just doesn't work that way.
Another edit: I will stick in one more comment, since I thought of it last night. Function types (and function abstraction) form the basis of pretty much all programming languages -- at least at some level, even if it's "under the hood." However, there are languages designed to define the semantics of other languages. While this doesn't strictly match what you're talking about, PLT Redex is one such system, and is used for specifying and debugging the semantics of programming languages. It's not super useful from a practitioner's perspective (unless your goal is to design new languages, in which case it is fairly useful), but maybe that fits what you want.
Do you mean meta-circular evaluators like in SICP? Being able to write your own DSL? If you create your own "function type", you'll have to take care of "applying" it, yourself.
Just as an example, you could create your own "function" in C for instance, with a look-up table holding function pointers, and use integers as functions. You'd have to provide your own "call" function for such "functions", of course:
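/* Table of function pointers, assumed to be defined and filled in elsewhere. */
extern void (*lookup_table[])(int);

void call(unsigned int function, int data) {
    lookup_table[function](data);
}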
You'd also probably want some means of creating more complex functions from primitive ones, for instance using arrays of ints to signify sequential execution of your "primitive functions" 1, 2, 3, ... and end up inventing a whole new language for yourself.
I think early assemblers had no ability to create callable "macros" and had to use GOTO.
You could use trampolining to simulate function calls. You could have only a store of global variables, with shallow binding perhaps. In such a language "functions" would be definable, though not a primitive type.
So having functions in a language is not necessary, though it is convenient.
In Common Lisp defun is nothing but a macro associating a name and a callable object (though lambda is still a built-in). In AutoLisp originally there was no special function type at all, and functions were represented directly by quoted lists of s-expressions, with first element an arguments list. You can construct your function through use of cons and list functions, from symbols, directly, in AutoLisp:
(setq a (list (cons 'x NIL) '(+ 1 x)))
(a 5)
==> 6
Some languages (like Python) support more than one primitive function type, each with its calling protocol - namely, generators support multiple re-entry and returns (even if syntactically through the use of same def keyword). You can easily imagine a language which would let you define your own calling protocol, thus creating new function types.
Edit: as an example consider dealing with multiple arguments in a function call, the choice between automatic currying or automatic optional args etc. In Common Lisp, say, you could easily create yourself two different call macros to directly represent the two calling protocols. Consider functions returning multiple values not through a kludge of aggregates (tuples, in Haskell), but directly into designated recipient vars/slots. All are different types of functions.
Function definition is usually primitive because (a) functions are how programmes get things done; and (b) this sort of lambda-abstraction is necessary to be able to programme in a pointful style (i.e. with explicit arguments).
Probably the closest you will come to a language that meets your criteria is one based on a purely pointfree model which allows you to create your own lambda operator. You might like to explore pointfree languages in general, and ones based on SKI calculus in particular: http://en.wikipedia.org/wiki/SKI_combinator_calculus
In such a case, you still have primitive function types, and you always will, because it is a fundamental element of the type system. If you want to get away from that at all, probably the best you could do would be some kind of type system based on a category-theoretic generalisation of functions, such that functions would be a special case of another type. See http://en.wikipedia.org/wiki/Category_theory.
Which languages don't have -> as a primitive type?
Well, if you mean a type that can be named, then there are many languages that don't have them. All languages where functions are not first-class citizens don't have -> as a type you could mention somewhere.
But, as @Kristopher eloquently and excellently explained, functions are (or can, at least, be perceived as) the very basic building blocks of all computation. Hence even in Java, say, there are functions, but they are carefully hidden from you.
And, since someone mentioned assembler: one could maintain that the machine language (of most contemporary computers) is an approximation of the model of the register machine. But how is it done? With millions and billions of logic circuits, each of them being a materialization of quite primitive pure functions like NOT or NAND, arranged in a certain physical order (which is, obviously, the way hardware engineers implement function composition).
Hence, while you may not see functions in machine code, they're still the basis.
In Martin-Löf type theory, function types are defined via indexed product types (so-called Π-types).
Basically, the type of functions from A to B can be interpreted as a (possibly infinite) record, where all the fields are of the same type B, and the field names are exactly all the elements of A. When you need to apply a function f to an argument x, you look up the field in f corresponding to x.
The wikipedia article lists some programming languages that are based on Martin-Löf type theory. I am not familiar with them, but I assume that they are a possible answer to your question.
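The record reading of function types can be sketched for a tiny argument type. This covers only the non-dependent, finite case with A = Bool, whereas Π-types in Martin-Löf type theory are more general; the names BoolFun, atFalse, atTrue, apply and tabulate are made up for the illustration.

```haskell
-- A function out of Bool, stored as a record with one field per element of Bool.
-- The names BoolFun, atFalse and atTrue are made up for this sketch.
data BoolFun b = BoolFun { atFalse :: b, atTrue :: b }

-- Application is field lookup.
apply :: BoolFun b -> Bool -> b
apply f False = atFalse f
apply f True  = atTrue f

-- Tabulating an ordinary function gives the record back.
tabulate :: (Bool -> b) -> BoolFun b
tabulate g = BoolFun { atFalse = g False, atTrue = g True }

notTable :: BoolFun Bool
notTable = tabulate not
```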
Philip Wadler's paper Call-by-value is dual to call-by-name presents a calculus in which variable abstraction and covariable abstraction are more primitive than function abstraction. Two definitions of function types in terms of those primitives are provided: one implements call-by-value, and the other call-by-name.
Inspired by Wadler's paper, I implemented a language (Ambidexter) which provides two function type constructors that are synonyms for types constructed from the primitives. One is for call-by-value and one for call-by-name. Neither Wadler's dual calculus nor Ambidexter provides user-defined type constructors. However, these examples show that function types are not necessarily primitive, and that a language in which you can define your own (->) is conceivable.
In Scala you can mixin one of the Function traits, e.g. a Set[A] can be used as A => Boolean because it implements the Function1[A,Boolean] trait. Another example is PartialFunction[A,B], which extends usual functions by providing a "range-check" method isDefinedAt.
However, in Scala methods and functions are different, and there is no way to change how methods work. Usually you don't notice the difference, as methods are automatically lifted to functions.
So you have a lot of control over how you implement and extend functions in Scala, but I think you have a real "replacement" in mind. I'm not sure this even makes sense.
Or maybe you are looking for languages with some kind of generalization of functions? Then Haskell with Arrow syntax would qualify: http://www.haskell.org/arrows/syntax.html
I suppose the dumb answer to your question is assembly code. This provides you with primitives even "lower" level than functions. You can create functions as macros that make use of register and jump primitives.
Most sane programming languages will give you a way to create functions as a baked-in language feature, because functions (or "subroutines") are the essence of good programming: code reuse.

Is it possible to unify the concepts of inheritance and parametric polymorphism?

I wonder if it is generally possible to unify the concepts of inheritance and parametric polymorphism ("generics"), especially regarding variance but also in terms how ("syntax") and where (use-site/declaration-site) they would have to be defined?
Consider this point of view:
Sub-typing e. g. S <: T can be perceived as co-variant behavior, because input arguments accepting T will also accept S.
Changing the "variance of the inheritance model" to invariant is only possible at definition-side by disallowing sub-typing (e. g. adding a final modifier to a class definition), contra-variance is not possible as far as I have seen in most cases
Parametric polymorphism is invariant by default, but can be made co-/contra-variant
There seems to be a non-negligible concept mismatch between both, considering
the pains languages have generated by allowing "unsafe" covariance (e. g. String[] <: Object[] in Java/C#)
the differences in how parametric polymorphism is declared and used compared to inheritance
In some languages it can be seen that both work together nicely though, like
class Foo extends Ordered[Foo]
to implement ordering/comparison behaviour.
Is it conceivable that the concepts of inheritance and parametric polymorphism could be unified and gain the same default variance behavior (e. g. covariance by default, or would that make it necessary to mark most types with an invariance annotation instead, therefore just moving the ugliness to another point)? Would this be more practical if data structures became immutable by default, too?
Is there a formal system in which this has been proven to be sound?
Which syntax options/changes would be most likely necessary, regardless of a concrete programming language?
Is there some working example or a language where this/something similar is already working?
By covariance/contravariance, one usually means this. Suppose X, Y, Z are types. Suppose further that a → b denotes a function type with an argument of type a and a result of type b. <: denotes the subtype relation, or perhaps some other notion of "conformance". The ⇒ arrow reads "entails". Then the following holds:
X <: Y ⇒ (Z → X) <: (Z → Y)
X <: Y ⇒ (Y → Z) <: (X → Z)
That is, the function type constructor is covariant with respect to the result type (data source), and contravariant with respect to the argument type (data sink). This is a basic fact and you more or less cannot do anything too creative about it, like reversing the directions of arrows. Of course you can always use no-variance in place of co- or contravariance (most languages do).
Object types can be canonically encoded with function types, so there's not too much freedom here either. Every type parameter represents either data source (covariant) or data sink (contravariant) or both (novariant). If it's sound and contravariant in one language, then in another language it's going to be either contravariant or unsound.
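The two variance rules for function types can be sketched in Haskell. Haskell has no subtyping, so this sketch assumes an encoding in which X <: Y is modelled as an explicit coercion function from X to Y; Sub, covariant, contravariant and the example names are all invented for the illustration.

```haskell
-- A subtyping judgement X <: Y, modelled as an explicit coercion X -> Y.
-- Haskell has no subtyping; this encoding is an assumption of the sketch.
newtype Sub x y = Sub { upcast :: x -> y }

-- X <: Y  =>  (Z -> X) <: (Z -> Y): coerce the result (covariant).
covariant :: Sub x y -> Sub (z -> x) (z -> y)
covariant (Sub up) = Sub (\f -> up . f)

-- X <: Y  =>  (Y -> Z) <: (X -> Z): coerce the argument (contravariant).
contravariant :: Sub x y -> Sub (y -> z) (x -> z)
contravariant (Sub up) = Sub (\g -> g . up)

-- Example coercion: every Int can be used where an Integer is expected.
intSubInteger :: Sub Int Integer
intSubInteger = Sub toInteger

showInteger :: Integer -> String
showInteger = show

-- A consumer of Integer is usable as a consumer of Int.
showInt :: Int -> String
showInt = upcast (contravariant intSubInteger) showInteger
```

Under this encoding there is no way to write a function of type Sub x y -> Sub (x -> z) (y -> z) without extra information, which matches the claim that the directions cannot be reversed.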
I think Scala is pretty close to an ideal language in this respect. You cite an example that looks a lot like Scala, so you are most likely familiar with the language. I wonder why you think that its type system works nicely only in some instances. What are the other instances?
One theoretical work that every aspiring language designer should read is "A Theory of Objects" by Luca Cardelli.

Can you implement any pure LISP function using the ten primitives? (ie no type predicates)

This site makes the following claim:
http://hyperpolyglot.wikidot.com/lisp#ten-primitives
McCarthy introduced the ten primitives of lisp in 1960. All other pure lisp
functions (i.e. all functions which don't do I/O or interact with the environment)
can be implemented with these primitives. Thus, when implementing or porting lisp,
these are the only functions which need to be implemented in a lower language. The
way the non-primitives of lisp can be constructed from primitives is analogous to
the way theorems can be proven from axioms in mathematics.
The primitives are: atom, quote, eq, car, cdr, cons, cond, lambda, label, apply.
My question is - can you really do this without type predicates such as numberp? Surely there is a point when writing a higher level function that you need to do a numeric operation - which the primitives above don't allow for.
Some numbers can be represented with just those primitives; it's just rather inconvenient and difficult to conceptualize the first time you see it.
Similar to how the natural numbers are represented with sets increasing in size, they can be simulated in Lisp as nested cons cells.
Zero would be the empty list, or (). One would be the singleton cons cell, or (() . ()). Two would be one plus one, or the successor of one, where we define the successor of x to be (cons () x) , which is of course (() . (() . ())). If you accept the Infinity Axiom (and a few more, but mostly the Infinity Axiom for our purposes so far), and ignore the memory limitations of real computers, this can accurately represent all the natural numbers.
It's easy enough to extend this to represent all the integers and then the rationals [1], but representing the reals in this notation would be (I think) impossible. Fortunately, this doesn't dampen our fun, as we can't represent all the reals on our computers anyway; we make do with floats and doubles. So our representation is just as powerful.
In a way, 1 is just syntactic sugar for (() . ()).
Hurray for set theory! Hurray for Lisp!
EDIT Ah, for further clarification, let me address your question of type predicates, though at this point it may already be clear. Since your numbers have a distinct form, you can test these linked lists with a function of your own creation that tests for this particular structure. My Scheme isn't good enough anymore to write it in Scheme, but I can attempt to in Clojure.
Regardless, you may be saying that it could give you false positives: perhaps you're simply trying to represent sets and you end up having the same structure as a number in this system. To that I reply: well, in that case, you do in fact have a number.
So you can see, we've got a pretty decent representation of numbers here, aside from how much memory they take up (not our concern) and how ugly they look when printed at the REPL (also, not our concern) and how inefficient it will be to operate on them (e.g. we have to define our addition etc. in terms of list operations: slow and a bit complicated.) But none of these are our concern: the speed really should and could depend on the implementation details, not what you're doing with the language.
So here, in Clojure (but using only things we basically have access to in our simple Lisp), is numberp. (I hope; feel free to correct me, I'm groggy as hell etc. excuses etc.)
(defn numberp
  [x]
  (cond
    (nil? x) true                       ; zero is the empty list
    ;; a successor is (cons nil n): the head must be nil and the tail a number
    (and (coll? x) (nil? (first x))) (numberp (next x))
    :else false))
[1] For integers, represent them as cons cells of the naturals. Let the first element in the cons cell be the "negative" portion of the integer, and the second element be the "positive" portion of the integer. In this way, -2 can be represented as (2, 0) or (4, 2) or (5, 3) etc. For the rationals, let them be represented as cons cells of the integers: e.g. (-2, 3) etc. This does give us the possibility of having several different data structures representing the same number: however, this can be remedied by writing functions that test two numbers to see if they're equivalent: we'd define these functions in terms of the already-existing equivalence relations set theory offers us. Fun stuff :)
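The encoding above, including the equivalence test from footnote [1], can be sketched in Haskell, where the nested cons cells become a small algebraic data type. All names here (Nat, fromInt, plus, Z, sameInteger) are invented for the sketch, and the pair layout (negative part first) follows the footnote.

```haskell
-- Naturals as nested cells: Zero plays the role of (), Succ n of (cons () n).
data Nat = Zero | Succ Nat
  deriving (Show, Eq)

-- Hypothetical helper for the sketch: build the n-th natural.
fromInt :: Int -> Nat
fromInt n
  | n <= 0    = Zero
  | otherwise = Succ (fromInt (n - 1))

-- Addition defined purely by structural recursion on the first argument.
plus :: Nat -> Nat -> Nat
plus Zero     m = m
plus (Succ n) m = Succ (plus n m)

-- An integer is a pair (negative part, positive part), as in footnote [1].
data Z = Z Nat Nat
  deriving Show

-- (n1, p1) and (n2, p2) denote the same integer when n1 + p2 == n2 + p1.
sameInteger :: Z -> Z -> Bool
sameInteger (Z n1 p1) (Z n2 p2) = plus n1 p2 == plus n2 p1
```

With this test, the pairs (2, 0) and (4, 2) are recognised as the same integer even though they are different data structures.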

Examples of monoids/semigroups in programming

It is well-known that monoids are stunningly ubiquitous in programming. They are so ubiquitous and so useful that I, as a 'hobby project', am working on a system that is completely based on their properties (distributed data aggregation). To make the system useful I need useful monoids :)
I already know of these:
Numeric or matrix sum
Numeric or matrix product
Minimum or maximum under a total order with a top or bottom element (more generally, join or meet in a bounded lattice, or even more generally, product or coproduct in a category)
Set union
Map union where conflicting values are joined using a monoid
Intersection of subsets of a finite set (or just set intersection if we speak about semigroups)
Intersection of maps with a bounded key domain (same here)
Merge of sorted sequences, perhaps with joining key-equal values in a different monoid/semigroup
Bounded merge of sorted lists (same as above, but we take the top N of the result)
Cartesian product of two monoids or semigroups
List concatenation
Endomorphism composition.
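Several entries of this list already have counterparts in Haskell's base and containers libraries. The sketch below is only an illustration of a few of them under that assumption; the binding names (total, prod, biggest, wordCounts, both, pipeline) are made up.

```haskell
import qualified Data.Map as Map
import Data.Monoid (Sum (..), Product (..), Endo (..))
import Data.Semigroup (Max (..))

-- Numeric sum and product.
total :: Sum Int
total = foldMap Sum [1, 2, 3, 4]

prod :: Product Int
prod = foldMap Product [1, 2, 3, 4]

-- Maximum is only a semigroup in general; Max a is a Monoid when a is Bounded.
biggest :: Max Int
biggest = foldMap Max [3, 9, 2]

-- Map union where conflicting values are joined using a monoid (here Sum).
-- Note: the stock Monoid instance of Map is left-biased, so unionWith is used.
wordCounts :: Map.Map String (Sum Int)
wordCounts = Map.unionWith (<>)
  (Map.fromList [("a", Sum 1), ("b", Sum 2)])
  (Map.fromList [("b", Sum 3), ("c", Sum 4)])

-- Cartesian product of two monoids: pairs combine componentwise.
both :: (Sum Int, [String])
both = (Sum 1, ["x"]) <> (Sum 2, ["y"])

-- Endomorphism composition.
pipeline :: Endo Int
pipeline = Endo (+ 1) <> Endo (* 2)
```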
Now, let us define a quasi-property of an operation as a property that holds up to an equivalence relation. For example, list concatenation is quasi-commutative if we consider lists of equal length or with identical contents up to permutation to be equivalent.
Here are some quasi-monoids and quasi-commutative monoids and semigroups:
Any (a+b = a or b, if we consider all elements of the carrier set to be equivalent)
Any satisfying predicate (a+b = the one of a and b that is non-null and satisfies some predicate P, if none does then null; if we consider all elements satisfying P equivalent)
Bounded mixture of random samples (xs+ys = a random sample of size N from the concatenation of xs and ys; if we consider any two samples with the same distribution as the whole dataset to be equivalent)
Bounded mixture of weighted random samples
Let's call it "topological merge": given two acyclic and non-contradicting dependency graphs, a graph that contains all the dependencies specified in both. For example, list "concatenation" that may produce any permutation in which elements of each list follow in order (say, 123+456=142356).
Which others do exist?
Quotient monoid is another way to form monoids (quasimonoids?): given monoid M and an equivalence relation ~ compatible with multiplication, it gives another monoid. For example:
finite multisets with union: if A* is a free monoid (lists with concatenation), ~ is "is a permutation of" relation, then A*/~ is a free commutative monoid.
finite sets with union: If ~ is modified to disregard count of elements (so "aa" ~ "a") then A*/~ is a free commutative idempotent monoid.
syntactic monoid: Any regular language gives rise to a syntactic monoid that is the quotient of A* by the "indistinguishability by language" relation. Here is a finger tree implementation of this idea. For example, the language {a^(3n) : n natural} has Z_3 as the syntactic monoid.
Quotient monoids automatically come with a homomorphism M -> M/~ that is surjective.
A "dual" construction is the submonoid. Submonoids come with a homomorphism A -> M that is injective.
Yet another construction on monoids is the tensor product.
Monoids allow exponentiation by squaring in O(log n) and fast parallel prefix sums computation. Also they are used in the Writer monad.
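Exponentiation by squaring needs nothing beyond associativity and an identity, so it can be written once for every monoid. The following is a minimal sketch; mpow is a name made up here, and base already offers stimes and mtimesDefault for the same job.

```haskell
import Data.Monoid (Sum (..), Product (..))

-- Combine n copies of x using O(log n) applications of (<>).
-- mpow is a name made up for this sketch; base offers stimes/mtimesDefault
-- for the same job.
mpow :: Monoid m => m -> Integer -> m
mpow x n
  | n <= 0    = mempty
  | even n    = half <> half
  | otherwise = x <> half <> half
  where
    -- Computed once and shared by both uses above.
    half = mpow x (n `div` 2)
```

The same function computes powers for Product, repeated addition for Sum, and string repetition for lists, since each is just a different monoid.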
The Haskell standard library is alternately praised and attacked for its use of the actual mathematical terms for its type classes. (In my opinion it's a good thing, since without it I'd never even know what a monoid is!). In any case, you might check out http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Monoid.html for a few more examples:
the dual of any monoid is a monoid: given a+b, define a new operation ++ with a++b = b+a
conjunction and disjunction of booleans
over the Maybe monad (aka "option" in OCaml), first and last. That is,
first (Just a) b = Just a
first Nothing b = b
and likewise for last
The latter is just the tip of the iceberg of a whole family of monoids related to monads and arrows, but I can't really wrap my head around these (other than simply monadic endomorphisms). But a google search on monads monoids turns up quite a bit.
A really useful example of a commutative monoid is unification in logic and constraint languages. See section 2.8.2.2 of 'Concepts, Techniques and Models of Computer Programming' for a precise definition of a possible unification algorithm.
Good luck with your language! I'm doing something similar with a parallel language, using monoids to merge subresults from parallel computations.
Arbitrary length Roman numeral value computation.
https://gist.github.com/4542999