Universal Quantification in Isabelle/HOL - proof

It has come to my attention that there are several ways to deal with universal quantification when working with Isabelle/HOL Isar. I am trying to write some proofs in a style that is suitable for undergraduate students to understand and reproduce (that's why I'm using Isar!) and I am confused about how to express universal quantification in a nice way.
In Coq for example, I can write forall x, P(x) and then I may say "induction x" and that will automatically generate goals according to the corresponding induction principle. However, in Isabelle/HOL Isar, if I want to directly apply an induction principle I must state the theorem without any quantification, like this:
lemma foo: "P(x)"
proof (induct x)
And this works fine, as x is then treated as a schematic variable, as if it were universally quantified. However, it lacks the universal quantification in the statement, which is not very educational. Another way I have found is by using \<And> and \<forall>. However, I cannot directly apply the induction principle if I state the lemma this way; I have to first fix the universally quantified variables... which again seems inconvenient from an educational point of view:
lemma foo: "\<And>x. P(x)"
proof -
fix x
show "P(x)"
proof (induct x)
What is a nice proof pattern for expressing universal quantification that does not require me to explicitly fix variables before induction?

You can use induct_tac, case_tac, etc. These are the legacy variants of the induct/induction and cases methods used in proper Isar. They can operate on bound meta-universally-quantified variables in the goal state, like the x in your second example:
lemma foo: "⋀x. P(x :: nat)"
proof (induct_tac x)
One disadvantage of induct_tac over induction is that it does not provide cases, so you cannot just write case (Suc x) and then from Suc.IH and show ?case in your proof. Another disadvantage is that addressing bound variables is, in general, rather fragile, since their names are often generated automatically by Isabelle and may change when Isabelle changes. (not in the case you have shown above, of course)
This is one of the reasons why Isar proofs are preferred these days. I would strongly advise against showing your students ‘bad’ Isabelle with the intention that it is easier for them to understand.
The facts are these: free variables in a theorem statement in Isabelle are logically equivalent to universally-quantified variables and Isabelle automatically converts them to schematic variables after you have proven it. This convention is not unique to Isabelle; it is common in mathematics and logic, and it helps to reduce clutter. Isar in particular tries to avoid explicit use of the ⋀ operator in goal statements (i.e. have/show; they still appear in assume).
Or, in short: free variables in theorems are universally quantified by default. I doubt that students will find this hard to understand; I certainly did not when I started with Isabelle as a BSc student. In fact, I found it much more natural to state a theorem as xs @ (ys @ zs) = (xs @ ys) @ zs instead of ∀xs ys zs. xs @ (ys @ zs) = (xs @ ys) @ zs.


When to use and when not to use pointfree style in Haskell?

I just learned about pointfree style in Haskell and how it can help tidy up the code and make it easier to read. But sometimes it can make the code a bit too terse.
So, when should I always use pointfree style, and in what scenarios should I absolutely avoid pointfree style in Haskell?
As already commented, it's a matter of taste and there will always be edge cases where both styles are equally suited (or, indeed, a partially-pointed version is best). However, there are some cases where it's clear enough:
If a pointed expression can be η-reduced just like that, it's usually a good idea to do it.
f x = g (h x)
is better written as
f = g . h
If you want to memoise some computation before accepting the remaining function parameters, you must keep those parameters out of scope. For instance,
linRegression :: [(Double, Double)] -> Double -> Double
linRegression ps x = a * x + b
  where (a, b) = ... -- expensive calculation of the regression coefficients,
                     -- depending only on `ps`
isn't optimal performance-wise, because the coefficients will need to be recomputed for every x value received. If you make it point-free:
linRegression :: [(Double, Double)] -> Double -> Double
linRegression ps = (+b) . (a*)
  where (a, b) = ...
this problem doesn't arise. (Perhaps GHC will in some cases figure this out by itself, but I wouldn't rely on it.)
Often though, it is better to make it pointed nevertheless, just not with an x in the same scope as a and b but bound by a dedicated lambda:
linRegression :: [(Double, Double)] -> Double -> Double
linRegression ps = \x -> a * x + b
  where (a, b) = ...
If the point-free version is actually longer than the pointed version, I wouldn't use it. If you need to introduce tricks like flip or the Monad ((->) a) instance to get it point-free, and this doesn't even make it shorter, then it will almost certainly be less readable than the pointed version.
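As a hedged illustration of that last point (my own example, not from the answer above): pairing each list element with the result of applying a function to it. The point-free version leans on the closely related Applicative instance for ((->) a) and comes out no shorter:
-- Pointed version: obvious at a glance.
pairWith :: (a -> b) -> [a] -> [(a, b)]
pairWith f xs = map (\x -> (x, f x)) xs

-- Point-free version, using the Applicative instance for ((->) a):
-- ((,) <*> f) x = (x, f x).  No shorter, and much harder to read.
pairWith' :: (a -> b) -> [a] -> [(a, b)]
pairWith' = map . ((,) <*>)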
My favorite answer comes from Richard Bird's Thinking Functionally with Haskell: pointfree style helps you reason about function composition while a pointed style helps you reason about function application.
If you find that a pointfree style is awkward for writing a particular function then you generally have two options:
Use a pointed style instead. Sometimes you do want to reason about application.
Redesign your function to be compositional in nature.
In my own programs, I've found that (2) often leads to a better design and that this design can then be more clearly expressed using a pointfree style. Pointfree style is not an end goal: it is a means to achieving a more compositional design.

Does the term "monad" apply to values of types like Maybe or List, or does it instead apply only to the types themselves?

I've noticed that the word "monad" seems to be used in a somewhat inconsistent way. I've come to believe that this is because many (if not most) of the monad tutorials out there are written by folks who have only just started to figure monads out themselves (eg: nuclear waste spacesuit burritos), and so the term ends up getting kind of overloaded/corrupted.
In particular, I'm wondering whether the term "monad" can be applied to individual values of types like Maybe, List or IO, or if the term "monad" should really only be applied to the types themselves.
This is a subtle distinction, so perhaps an analogy might make it more clear. In mathematics we have rings, fields, groups, etc. These terms apply to an entire set of values along with the operations that can be performed on them, rather than to individual elements. For example, the integers (along with the operations of addition, negation and multiplication) form a ring. You could say "Integer is a ring", but you would never say "5 is a ring".
So, can you say "Just 5 is a monad", or would that be as wrong as saying "5 is a ring"? I don't know category theory, but I'm under the impression that it really only makes sense to say "Maybe is a monad" and not "Just 5 is a monad".
"Monad" (and "Functor") are popularly misused as describing values.
No value is a monad, functor, monoid, applicative functor, etc.
Only types & type constructors (higher-kinded types) can be.
When you hear (and you will) that "lists are monoids" or "functions are monads", etc, or "this function takes a monad as an argument", don't believe it.
Ask the speaker "How can any value be a monoid (or monad or ...), considering that Haskell's classes classify types (including higher-order ones) rather than values?"
Lists are not monoids (etc). List a is.
My guess is that this popular misuse stems from mainstream languages having value classes and not type classes, so that habitual, unconscious value-class thinking sneaks in.
Why does it matter whether we use language precisely?
Because we think in language and we build & convey understandings via language.
So in order to have clear thoughts, it helps to have clear language (or be able to at any time).
"The slovenliness of our language makes it easier for us to have foolish thoughts. The point is that the process is reversible." - George Orwell, Politics and the English Language
Edit: These remarks apply to Haskell, not to the more general setting of category theory.
List is a monad, List a is a type, and [] is a List a (an element of a type).
Technically, a monad is a functor with extra structure; and in Haskell we only use functors from the category of Haskell types to itself.
It is thus in particular a "function" which takes a type and returns another type (it has kind * -> *).
List, State s, Maybe, etc are monads. State is not a monad, since it has kind * -> * -> *.
(aside: to confuse matters, Monads are just functors, and if I give myself a partially ordered set A, then it forms a category, with Hom(a, b) = { 1 element } if a <= b and Hom(a, b) = empty otherwise. Now any increasing function f : A -> A forms a functor, and monads are those functions which satisfy x <= f(x) and f(f(x)) <= f(x), hence f(f(x)) = f(x) -- monads here are technically "elements of A -> A". See also closure operators.)
(aside 2: since you appear to know some mathematics, I encourage you to read about category theory. You'll see among others that algebraic structures can be seen as arising from monads. See this excellent blog entry by Dan Piponi for a teaser.)
To be exact, monads are structures from category theory. They don't have a direct code counterpart. For simplicity let's talk about general functors instead of monads. In the case of Haskell, roughly speaking, a functor is a mapping from a class of types to a class of types that also maps functions in the first class to functions in the second. The Functor instance gives you access to the mapping function, but doesn't directly capture the concept of functors.
It is however fair to say that the type constructor as mentioned in the Functor instance is the actual functor:
instance Functor Tree
In this case Tree is the functor. However, because Tree is a type constructor, it can't express both of the mappings that make up a functor (the type mapping and the function mapping) at the same time. The function that maps functions is called fmap. So if you want to be precise, you have to say that the tuple (Tree, fmap) is the functor, where fmap is the particular fmap from Tree's Functor instance. For convenience, again, we say that Tree is the functor, because the corresponding fmap follows from its Functor instance.
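For concreteness, here is a minimal sketch of such an instance (this particular Tree type is my own illustration, not taken from the answer):
-- A toy Tree type for illustration only.
data Tree a = Leaf a | Node (Tree a) (Tree a)

-- The instance records only the function mapping (fmap); the type mapping
-- is Tree itself.  Strictly speaking, the functor is the pair (Tree, fmap).
instance Functor Tree where
  fmap f (Leaf x)   = Leaf (f x)
  fmap f (Node l r) = Node (fmap f l) (fmap f r)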
Note that functors are always type constructors of kind * -> *. So Maybe Int is not a functor; the functor is Maybe. Also, people often talk about "the state monad", which is also imprecise. State is a whole family of infinitely many state monads, as you can see in the instance:
instance Monad (State s)
For every type s the type constructor State s (of kind * -> *) is a state monad, one of many.
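To make that concrete, here is a small sketch (my own example) using the standard Control.Monad.State module from the mtl/transformers packages: State alone has kind * -> * -> * and cannot be a monad, but each partial application State s can.
import Control.Monad.State (State, get, put, evalState)

-- 'tick' lives in the monad 'State Int', i.e. 'State' applied to 'Int'.
tick :: State Int Int
tick = do
  n <- get        -- read the current counter
  put (n + 1)     -- store the incremented counter
  return n        -- return the old value

-- evalState tick 0  evaluates to  0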
So, can you say "Just 5 is a monad", or would that be as wrong as saying "5 is a ring"?
Your intuition is exactly right. Int is to Ring (or AbelianGroup or whatever) as Maybe is to Monad (or Functor or whatever). Values (5, Just 5, etc.) are unimportant.
In algebra, we say the set of integers form a ring; in Haskell we would say (informally) that Int is a member of the Ring typeclass, or (slightly more formally) that there exists a Ring instance for Int. You might find this proposal fun and/or useful. Anyway, same deal with monads.
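To spell the analogy out in code (a hypothetical Ring class of my own invention; Haskell's standard library has no such class, though packages with similar classes exist):
-- Hypothetical Ring class, for the sake of the analogy only.
class Ring a where
  zero :: a
  one  :: a
  add  :: a -> a -> a
  mul  :: a -> a -> a
  neg  :: a -> a

-- Int is an *instance* of Ring, just as Maybe is an instance of Monad.
-- The values 5 and Just 5 never appear in the instance heads.
instance Ring Int where
  zero = 0
  one  = 1
  add  = (+)
  mul  = (*)
  neg  = negate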
I don't know category theory, but ...
Whatever, if you know a thing or two about abstract algebra, you're golden.
I would say "Just 5 is a value of a type that is an instance of Monad", just as I would say "5 is a number whose type (Integer) is a ring".
I use the term instance because that is how in Haskell you declare an implementation of a typeclass, and Monad is one of them.

Clojure - test for equality of function expression?

Suppose I have the following clojure functions:
(defn a [x] (* x x))
(def b (fn [x] (* x x)))
(def c (eval (read-string "(defn d [x] (* x x))")))
Is there a way to test for the equality of the function expression - some equivalent of
(eqls a b)
returns true?
It depends on precisely what you mean by "equality of the function expression".
These functions are going to end up as bytecode, so I could for example dump the bytecode corresponding to each function to a byte[] and then compare the two bytecode arrays.
However, there are many different ways of writing semantically equivalent methods, that wouldn't have the same representation in bytecode.
In general, it's impossible to tell what a piece of code does without running it. So it's impossible to tell whether two bits of code are equivalent without running both of them, on all possible inputs.
This is at least as bad, computationally speaking, as the halting problem, and possibly worse.
The halting problem is undecidable as it is, so the general-case answer here is definitely no (and not just for Clojure but for every programming language).
I agree with the above answers regarding Clojure's lack of a built-in ability to determine the equivalence of two functions, and that it has been proven that you cannot determine equality by testing programs functionally (also known as black-box testing), due to the halting problem (unless the input set is finite and defined).
I would like to point out that it is possible to algebraically determine the equivalence of two functions, even if they have different forms (different byte code).
The method for proving the equivalence algebraically was developed in the 1930s by Alonzo Church and is known as beta reduction in the lambda calculus. This method is certainly applicable to the simple forms in your question (which would also yield the same byte code) and also to more complex forms that would yield different byte codes.
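To make the idea concrete, here is a small Haskell sketch (my own illustration, not tied to Clojure or to the answer above) that decides beta-equivalence for terms that actually have a normal form, by normalising both terms and comparing the results. De Bruijn indices are used so that alpha-equivalence becomes plain structural equality; for terms without a normal form the procedure simply never terminates, which is the halting problem showing up again.
-- Untyped lambda terms with de Bruijn indices.
data Term = Var Int | Lam Term | App Term Term
  deriving (Eq, Show)

-- shift the free variables >= c by d
shift :: Int -> Int -> Term -> Term
shift d c (Var k)   = Var (if k >= c then k + d else k)
shift d c (Lam t)   = Lam (shift d (c + 1) t)
shift d c (App f a) = App (shift d c f) (shift d c a)

-- substitute s for variable j in t
subst :: Int -> Term -> Term -> Term
subst j s (Var k)   = if k == j then s else Var k
subst j s (Lam t)   = Lam (subst (j + 1) (shift 1 0 s) t)
subst j s (App f a) = App (subst j s f) (subst j s a)

-- one step of normal-order (leftmost-outermost) beta reduction, if possible
step :: Term -> Maybe Term
step (App (Lam t) a) = Just (shift (-1) 0 (subst 0 (shift 1 0 a) t))
step (App f a)       = case step f of
                         Just f' -> Just (App f' a)
                         Nothing -> App f <$> step a
step (Lam t)         = Lam <$> step t
step (Var _)         = Nothing

normalise :: Term -> Term
normalise t = maybe t normalise (step t)

-- Beta-equivalence for normalising terms: equal normal forms.
betaEq :: Term -> Term -> Bool
betaEq a b = normalise a == normalise b
For example, betaEq (Lam (Var 0)) (App (Lam (Lam (Var 0))) (Var 7)) is True: both sides normalise to the identity.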
I cannot add to the excellent answers by others, but would like to offer another viewpoint that helped me. If you are e.g. testing that the correct function is returned from your own function, instead of comparing the function object you might get away with just returning the function as a 'symbol.
I know this probably is not what the author asked for but for simple cases it might do.

Is it possible to unify the concepts of inheritance and parametric polymorphism?

I wonder if it is generally possible to unify the concepts of inheritance and parametric polymorphism ("generics"), especially regarding variance but also in terms how ("syntax") and where (use-site/declaration-site) they would have to be defined?
Consider this point of view:
Sub-typing, e.g. S <: T, can be perceived as co-variant behavior, because input arguments accepting T will also accept S.
Changing the "variance of the inheritance model" to invariant is only possible at the definition side, by disallowing sub-typing (e.g. adding a final modifier to a class definition); contra-variance is not possible in most cases, as far as I have seen.
Parametric polymorphism is invariant by default, but can be made co-/contra-variant
There seems to be a non-negligible concept mismatch between both, considering
the pains languages have caused by allowing "unsafe" covariance (e.g. String[] <: Object[] in Java/C#)
the differences in how parametric polymorphism is declared and used compared to inheritance
In some languages it can be seen that both work together nicely though, like
class Foo extends Ordered[Foo]
to implement ordering/comparison behaviour.
Is it conceivable that the concepts of inheritance and parametric polymorphism could be unified and given the same default variance behavior (e.g. covariance by default, or would that make it necessary to mark most types with an invariance annotation instead, thereby just moving the ugliness to another point)? Would this be more practical if data structures were immutable by default, too?
Is there a formal system in which this has been proven to be sound?
Which syntax options/changes would most likely be necessary, regardless of a concrete programming language?
Is there some working example or a language where this/something similar is already working?
By covariance/contravariance, one usually means this. Suppose X, Y, Z are types. Suppose further that a → b denotes a function type with an argument of type a and a result of type b. <: denotes the subtype relation, or perhaps some other notion of "conformance". The ⇒ arrow reads "entails". Then the following holds:
X <: Y ⇒ (Z → X) <: (Z → Y)
X <: Y ⇒ (Y → Z) <: (X → Z)
That is, the function type constructor is covariant with respect to the result type (data source), and contravariant with respect to the argument type (data sink). This is a basic fact and you more or less cannot do anything too creative about it, like reversing the directions of arrows. Of course you can always use no-variance in place of co- or contravariance (most languages do).
Object types can be canonically encoded with function types, so there's not too much freedom here either. Every type parameter represents either a data source (covariant), a data sink (contravariant), or both (novariant). If it's sound and contravariant in one language, then in another language it's going to be either contravariant or unsound.
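As a small side illustration in Haskell (my own sketch, not from the answer): Haskell has no subtyping, but the same source/sink distinction shows up in which type parameters support fmap (covariant positions) versus contramap (contravariant positions, via Data.Functor.Contravariant).
import Data.Functor.Contravariant (Contravariant (..))

newtype Source z a = Source (z -> a)   -- 'a' appears only as a result: a data source
newtype Sink   z a = Sink   (a -> z)   -- 'a' appears only as an argument: a data sink

instance Functor (Source z) where       -- covariant in 'a'
  fmap f (Source g) = Source (f . g)

instance Contravariant (Sink z) where   -- contravariant in 'a'
  contramap f (Sink g) = Sink (g . f)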
I think Scala is pretty close to an ideal language in this respect. You cite an example that looks a lot like Scala, so you are most likely familiar with the language. I wonder why you think that its type system works nicely only in some instances. What are the other instances?
One theoretical work that every aspiring language designer should read is "A Theory of Objects" by Luca Cardelli.

Can you implement any pure LISP function using the ten primitives? (ie no type predicates)

This site makes the following claim:
http://hyperpolyglot.wikidot.com/lisp#ten-primitives
McCarthy introduced the ten primitives of lisp in 1960. All other pure lisp
functions (i.e. all functions which don't do I/O or interact with the environment)
can be implemented with these primitives. Thus, when implementing or porting lisp,
these are the only functions which need to be implemented in a lower language. The
way the non-primitives of lisp can be constructed from primitives is analogous to
the way theorems can be proven from axioms in mathematics.
The primitives are: atom, quote, eq, car, cdr, cons, cond, lambda, label, apply.
My question is - can you really do this without type predicates such as numberp? Surely there is a point when writing a higher level function that you need to do a numeric operation - which the primitives above don't allow for.
Some numbers can be represented with just those primitives; it's just rather inconvenient and difficult to conceptualize the first time you see it.
Similar to how the natural numbers are represented with sets increasing in size, they can be simulated in Lisp as nested cons cells.
Zero would be the empty list, or (). One would be the singleton cons cell, or (() . ()). Two would be one plus one, or the successor of one, where we define the successor of x to be (cons () x) , which is of course (() . (() . ())). If you accept the Infinity Axiom (and a few more, but mostly the Infinity Axiom for our purposes so far), and ignore the memory limitations of real computers, this can accurately represent all the natural numbers.
It's easy enough to extend this to represent all the integers and then the rationals [1], but representing the reals in this notation would be (I think) impossible. Fortunately, this doesn't dampen our fun, as we can't represent all the reals on our computers anyway; we make do with floats and doubles. So our representation is just as powerful.
In a way, 1 is just syntactic sugar for (() . ()).
Hurray for set theory! Hurray for Lisp!
EDIT Ah, for further clarification, let me address your question about type predicates, though at this point it may already be clear. Since your numbers have a distinct form, you can test these linked lists with a function of your own creation that tests for this particular structure. My Scheme isn't good enough anymore to write it in Scheme, but I can attempt to in Clojure.
Regardless, you may be saying that it could give you false positives: perhaps you're simply trying to represent sets and you end up having the same structure as a number in this system. To that I reply: well, in that case, you do in fact have a number.
So you can see, we've got a pretty decent representation of numbers here, aside from how much memory they take up (not our concern), how ugly they look when printed at the REPL (also not our concern), and how inefficient it will be to operate on them (e.g. we have to define our addition etc. in terms of list operations: slow and a bit complicated). But none of these are our concern: the speed really should and could depend on the implementation details, not on what you're doing with the language.
So here, in Clojure (but using only things we basically have access to in our simple Lisp), is numberp. (I hope; feel free to correct me, I'm groggy as hell etc. excuses etc.)
(defn numberp
  [x]
  (cond
    (and (coll? x) (empty? x)) true                      ; zero: the empty list ()
    (and (coll? x) (= () (first x))) (numberp (rest x))  ; successor: (cons () n)
    :else false))
[1] For integers, represent them as cons cells of the naturals. Let the first element in the cons cell be the "negative" portion of the integer, and the second element be the "positive" portion of the integer. In this way, -2 can be represented as (2, 0) or (4, 2) or (5, 3) etc. For the rationals, let them be represented as cons cells of the integers: e.g. (-2, 3) etc. This does give us the possibility of having different data structures representing the same number; however, this can be remedied by writing functions that test two numbers to see if they're equivalent: we'd define these functions in terms of the already-existing equivalence relations set theory offers us. Fun stuff :)
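A rough Haskell transliteration of these encodings (my own sketch, written in Haskell rather than Lisp purely for brevity): naturals as lists of units, integers as pairs of naturals with an equivalence test, so that (2, 0) and (4, 2) both stand for -2, as in the footnote.
-- Sketch of the encodings from the footnote, for illustration only.
type Nat = [()]                 -- zero = [], successor n = () : n

addN :: Nat -> Nat -> Nat
addN = (++)

mulN :: Nat -> Nat -> Nat
mulN m n = concatMap (const n) m

eqN :: Nat -> Nat -> Bool
eqN []       []       = True
eqN (_ : as) (_ : bs) = eqN as bs
eqN _        _        = False

type Int' = (Nat, Nat)          -- (negative part, positive part): value = pos - neg

-- (n1, p1) and (n2, p2) denote the same integer iff p1 + n2 == p2 + n1.
eqI :: Int' -> Int' -> Bool
eqI (n1, p1) (n2, p2) = eqN (addN p1 n2) (addN p2 n1)
Rationals would then be pairs of Int' compared by cross-multiplication, exactly as the footnote suggests.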