Lexer in Haskell - How to Pattern Match specific case?

I'm currently working on a lexer written in Haskell, and am almost finished, but am running into a problem for a special case token. Currently, my lexer takes an input string and breaks down the statement into tokens for numbers, variable names, and specific tokens such as "if", "else", and "then".
It works great for all of my tokens, except for one that is "000...".
I was taught to use the span function, so I have my lexer use the isDigit and isAlphaNum boolean functions to parse the input. However, because "000..." starts with a zero, it automatically returns as a number. Additionally, the period is a token in the grammar as well, so the result of inputting "000..." in my lexer currently results in "0" "." "." ".".
I'm not proficient in the language of Haskell, but is it possible to match a string using isPrint, and use cases to handle instances of strings and integers? I'm at a loss for words right now, and it seems everything I have tried broke my program. My current pattern matching part looks like this:
lexer (c:cs)
  | isSpace c = lexer cs
  | isDigit c = lexDigit (c:cs)
  | isAlphaNum c = lexString (c:cs)
  | True = InvalidToken c : lexer cs
lexString cs
  | s1 == "if" = IfToken : lexer s2
  | s1 == "else" = ElseToken : lexer s2
  | s1 == "then" = ThenToken : lexer s2
  | s1 == "000..." = Zero : lexer s2
  | True = StringToken s1 : lexer s2
  where (s1,s2) = (span isAlphaNum cs)
Any help is appreciated!

First note that the idiomatic way to approach such a task in Haskell is to use a parser combinator library, such as parsec. (It may make sense to go the traditional parser/lexer route for some applications, but this isn't really something you should code by hand; use a lexer generator, e.g. alex.)
Now, if you are determined to do this by hand, and without more expressive parser combinators... you'll need to handle that special case in lexDigit, rather than lexString:
lexDigit :: String -> [Token] -- Always use type signatures!
lexDigit cs
  | ("000...",s2) <- splitAt 6 cs = Zero : lexer s2
lexDigit cs = ... -- your original definition of `lexDigit`
lexString :: String -> [Token]
lexString cs = case s1 of
    "if" -> IfToken : lexer s2
    "else" -> ElseToken : lexer s2
    "then" -> ThenToken : lexer s2
    -- no clause for "000...", since it can't happen here anyway
    _ -> StringToken s1 : lexer s2
  where (s1,s2) = (span isAlphaNum cs)
lexer :: String -> [Token]
lexer [] = [] -- end of input
lexer cs@(c:cs')
  | isSpace c = lexer cs'
  | isDigit c = lexDigit cs
  | isAlphaNum c = lexString cs
  | otherwise = InvalidToken c : lexer cs'
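For reference, here is a minimal self-contained sketch of the approach above. Several pieces are assumptions, not part of the original code: the Token type is hypothetical (only the constructor names IfToken, ElseToken, ThenToken, Zero, StringToken and InvalidToken appear in the thread), NumToken and DotToken are invented for illustration, and the fallback branch of lexDigit is a guess at what the elided original might do. It also uses stripPrefix from Data.List instead of the splitAt pattern guard; both check for the literal prefix "000...".

```haskell
import Data.Char (isAlphaNum, isDigit, isSpace)
import Data.List (stripPrefix)

-- Hypothetical token type; NumToken and DotToken are assumed names.
data Token
  = IfToken
  | ElseToken
  | ThenToken
  | Zero                 -- the special "000..." token
  | DotToken             -- assumed token for a single '.'
  | NumToken Int         -- assumed token for ordinary numbers
  | StringToken String
  | InvalidToken Char
  deriving (Eq, Show)

lexer :: String -> [Token]
lexer [] = []
lexer cs@(c:cs')
  | isSpace c    = lexer cs'
  | isDigit c    = lexDigit cs
  | isAlphaNum c = lexString cs
  | c == '.'     = DotToken : lexer cs'
  | otherwise    = InvalidToken c : lexer cs'

-- Try the special case first, then fall back to a plain number.
lexDigit :: String -> [Token]
lexDigit cs
  | Just rest <- stripPrefix "000..." cs = Zero : lexer rest
  | otherwise = NumToken (read digits) : lexer rest'
  where (digits, rest') = span isDigit cs

lexString :: String -> [Token]
lexString cs = case s1 of
    "if"   -> IfToken : lexer s2
    "else" -> ElseToken : lexer s2
    "then" -> ThenToken : lexer s2
    _      -> StringToken s1 : lexer s2
  where (s1, s2) = span isAlphaNum cs

main :: IO ()
main = do
  print (lexer "if 000... then 42 else x.")
  print (lexer "000.")
```

Because the special case is tried before the generic digit rule, "000..." becomes a single Zero token, while "000." still lexes as a number followed by a dot.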

Related

F# error FS0588: The block following this 'let' is unfinished. Every code block is an expression and must have a result

I am tasked with finishing an interpreter in F#, but I'm having some trouble, as I'm getting the error: error FS0588: The block following this 'let' is unfinished. Every code block is an expression and must have a result. 'let' cannot be the final code element in a block. Consider giving this block an explicit result.
It's been a long time since I last programmed in F#.
The following is my code. I have a helper function inside my eval function, called OperateAux. It gets called in the pattern matching, when it matches e with OPERATE. It should then call OperateAux, and calculate the given expression. The error I'm getting is at line: let OperateAux (op:BINOP) (e1:EXP) (e2:EXP) : VALUE =
so I guess somehow my helper function isn't finished, I just can't figure out where.
let rec eval (vtab : SymTab) (e : EXP) : VALUE =
    match e with
    | CONSTANT n -> n
    | VARIABLE v -> lookup v vtab
    | OPERATE (op, e1, e2) -> OperateAux op e1 e2//(eval vtab e1) (eval vtab e2)
    | LET_IN (var, e1, e2) -> failwith "case for LET_IN not handled"
    | OVER (rop, var, e1, e2, e3) -> failwith "case for OVER not handled"
    let OperateAux (op:BINOP) (e1:EXP) (e2:EXP) : VALUE =
        let (INT e1) = eval vtab e1
        let (INT e2) = eval vtab e2
        match op with
        | BPLUS -> (e1+e2)
        | BMINUS -> (e1-e2)
        | BTIMES -> (e1*e2)
        | _ -> ()
Here are some types. I'm not sure if they are relevant for this question, but for good measure I'll show them.
type VALUE = INT of int
type BINOP = BPLUS | BMINUS | BTIMES
type RANGEOP = RSUM | RPROD | RMAX | RARGMAX
type EXP =
| CONSTANT of VALUE
| VARIABLE of string
| OPERATE of BINOP * EXP * EXP
| LET_IN of string * EXP * EXP
| OVER of RANGEOP * string * EXP * EXP * EXP
(* A list mapping variable names to their values. *)
type SymTab = (string * VALUE) list
Never mind, I figured it out. You have to define your helper function before actually calling it. So the helper function OperateAux should come before the pattern matching which calls it.

Defining many function values elegantly in haskell

I want to define a function that will capitalize all lowercase letters:
yell :: Char -> Char
yell 'a' = 'A'
yell 'b' = 'B'
...
yell 'z' = 'Z'
yell ch = ch
What's the best way to do this? I can make a list of pairs of the appropriate inputs and outputs via zip ['a'..'z'] ['A'..'Z'] but I'm not sure how to turn this into a definition of yell.
I know that lookup is something of an option but then I have to futz with Maybe, and I wonder if there is anything even more elementary available.
You can use a guard, and make use of toUpper :: Char -> Char, of the Data.Char module for example:
import Data.Char(toUpper)
yell :: Char -> Char
yell c
  | 'a' <= c && c <= 'z' = toUpper c
  | otherwise = c
For ASCII characters, the uppercase letter is obtained by masking out the sixth bit (0010 0000, i.e. 0x20). So on that specific range toUpper is equivalent to chr . (complement 0x20 .&.) . ord, with complement and (.&.) taken from Data.Bits.
There are however other characters that have an uppercase variant such as characters with diacritics (àáâãäåæçèéêëìí…), Greek characters (αβγδεζηθικλ…), fullwidth characters (abcdefgh…), etc. These are all converted with toUpper, and can not (all) be converted with this trick.
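As a small illustration of the bit-masking trick for the ASCII range only, here is a sketch; the name yellAscii is made up for this example, and the guard keeps every character outside 'a'..'z' unchanged.

```haskell
import Data.Bits (complement, (.&.))
import Data.Char (chr, ord, toUpper)

-- Clear bit 0x20 of the code point, but only for ASCII lowercase letters.
yellAscii :: Char -> Char
yellAscii c
  | 'a' <= c && c <= 'z' = chr (ord c .&. complement 0x20)
  | otherwise            = c

main :: IO ()
main = do
  putStrLn (map yellAscii "hello, world!")
  -- On 'a'..'z' the mask agrees with toUpper.
  print (all (\c -> yellAscii c == toUpper c) ['a' .. 'z'])
```

Without the guard the mask would also change characters such as '{' or digits, which is why the range check matters.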
You can also perform the lookup with a dedicated lookup structure, for example a HashMap:
import Data.HashMap.Strict(HashMap, fromList)
import qualified Data.HashMap.Strict as HM
items :: HashMap Char Char
items = fromList (zip ['a' .. 'z'] ['A' .. 'Z'])
yell :: Char -> Char
yell c
  | Just y <- HM.lookup c items = y
  | otherwise = c
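If pulling in a hash map feels heavy, the zip idea from the question can be sketched with base alone: fromMaybe from Data.Maybe supplies the default, so the Maybe returned by lookup never has to be unpacked by hand. The name table is made up for this example, and lookup on a list is linear, which is fine for 26 entries.

```haskell
import Data.Maybe (fromMaybe)

-- Association list pairing each lowercase letter with its uppercase form.
table :: [(Char, Char)]
table = zip ['a' .. 'z'] ['A' .. 'Z']

-- Fall back to the character itself when it is not in the table.
yell :: Char -> Char
yell c = fromMaybe c (lookup c table)

main :: IO ()
main = putStrLn (map yell "hello, world!")
```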

OCaml : Raise an error inside a match with structure

In OCaml, I have a list of strings that contains names of towns (Something like "1-New York; 2-London; 3-Paris"). I need to ask the user to type a number (if they want London they have to type 2).
I want to raise an exception message saying that the town is not valid, if the person types for example "4", in the example.
I tried this, but it doesn't work :
let chosenTown = match int_of_string (input_line stdin) with
| x > (length listOfTowns) -> raise (Err "Not a valid town")
What's a good way to code "if the chosen number is bigger than the length of the list then raise the error"?
A pattern can't contain arbitrary expressions. It can be a constant, a constructor name, a record field inside curly braces, a list, an array, etc.
But patterns can be guarded, e.g.
match int_of_string (input_line stdin) with
| x when x >= List.length listOfTowns ->
    invalid_arg "the number is too large"
| x -> List.nth listOfTowns x
To complete the answer: pattern matching relies on unification and does not expect assertions (it is not the equivalent of a switch in C or so).
The idea is that you provide different "shapes" (patterns) that your term (the thing you match on) could have.
For a list for instance:
match l with
| e :: e' :: r -> (*...*)
| e :: r -> (*...*)
| [] -> (*...*)
It also has a binding effect: if you pass, say, [1] (a very small list indeed), it won't match e :: e' :: r, but it will match e :: r, and then e = 1 and r = [].
As ivg said, you can add conditions, as booleans this time, thanks to the keyword when.
However, when manipulating lists like this, I would go for a recursive function:
let rec find_town n l =
match l with
| t :: _ when n = 1 -> t
| _ :: r -> find_town (n-1) r
| [] -> raise (Err "Not a valid town")
This is basically writing again List.nth but changing the exception that it raises.

Proving lemma with implication based on functions

I want to prove the lemma below. I am trying to use the tactic 'destruct', but I can't prove it. Can anybody guide me on how to prove such lemmas? I can prove it for EmptyString, but not for variables s1 and s2. Thanks
Inductive nat : Set :=
| O : nat
| S : nat -> nat.
Inductive string : Set :=
| EmptyString : string
| String : ascii -> string -> string.
Fixpoint CompStrings (sa : string) (sb : string) {struct sb}: bool :=
match sa with
| EmptyString => match sb with
| EmptyString => true
| String b sb'=> false
end
| String a sa' => match sb with
| EmptyString => false
| String b sb'=> CompStrings sa' sb'
end
end.
Lemma Eq_lenght : forall (s1 s2 : string),
(CompStrings s1 s2) = true -> (Eq_nat (length s1) (length s2)) = true.
First off, let me argue about style. You could have written your function CompStrings as this:
Fixpoint CompStrings' (sa : string) (sb : string) {struct sb}: bool :=
match sa, sb with
| EmptyString, EmptyString => true
| EmptyString, _
| _, EmptyString => false
| String a sa', String b sb'=> CompStrings sa' sb'
end.
I find it easier to read. Here is a proof it's the same as yours, in case you're suspicious:
Theorem CompStrings'ok: forall sa sb, CompStrings sa sb = CompStrings' sa sb.
Proof.
intros. destruct sa, sb; simpl; reflexivity.
Qed.
Now, this will be a two-fold answer. First I'm just going to hint you at the direction for the proof. Then, I'll give you a full proof that I encourage you not to read before you've tried it yourself.
First off, I assumed this definition of length since you did not provide it:
Fixpoint length (s: string): nat :=
match s with
| EmptyString => O
| String _ rest => S (length rest)
end.
And since I did not have Eq_nat either, I proceeded to prove that the lengths are propositionally equal. It should be fairly trivial to adapt to Eq_nat.
Lemma Eq_length' : forall (s1 s2 : string),
CompStrings s1 s2 = true ->
length s1 = length s2.
Proof.
induction s1.
(* TODO *)
Admitted.
So here is the start! You want to prove a property about the inductive data type string. The thing is, you will want to proceed by case analysis, but if you just do it with destructs, it'll never end. This is why we proceed by induction. That is, you will need to prove that if s1 is the EmptyString, then the property holds, and that if the property holds for a substring, then it holds for the string with one character added. The two cases are fairly simple, in each case you can proceed by case analysis on s2 (that is, using destruct).
Note that I did not do intros s1 s2 C. before doing induction s1.. This is fairly important for one reason: if you do it (try!), your induction hypothesis will be too constrained as it will talk about one particular s2, rather than being quantified by it. This can be tricky when you start doing proofs by induction. So, be sure to try to continue this proof:
Lemma Eq_length'_will_fail : forall (s1 s2 : string),
CompStrings s1 s2 = true ->
length s1 = length s2.
Proof.
intros s1 s2 C. induction s1.
(* TODO *)
Admitted.
Eventually, you'll find that your induction hypothesis can't be applied to your goal, because it's speaking about a particular s2.
I hope you've tried these two exercises.
Now if you're stuck, here is one way to prove the first goal.
Don't cheat! :)
Lemma Eq_length' : forall (s1 s2 : string),
CompStrings s1 s2 = true ->
length s1 = length s2.
Proof.
induction s1.
intros s2 C. destruct s2. reflexivity. inversion C.
intros s2 C. destruct s2. inversion C. simpl in *. f_equal.
exact (IHs1 _ C).
Qed.
To put that in intelligible terms:
Let's prove the property forall s2, CompStrings s1 s2 = true -> length s1 = length s2 by induction on s1:
- in the case where s1 is the EmptyString, let's look at the shape of s2:
  - s2 is the EmptyString, then both lengths are equal to 0, so reflexivity.;
  - s2 is a String _ _, so there is a contradiction in the hypothesis, shown by inversion C.;
- in the case where s1 is a String char1 rest1, let's look at the shape of s2, supposing the property true for rest1:
  - s2 is the EmptyString, so there is a contradiction in the hypothesis, shown by inversion C.;
  - s2 is a String char2 rest2, then length s1 = S (length rest1) and length s2 = S (length rest2), therefore we need to prove S (length rest1) = S (length rest2). Also, the hypothesis C simplifies into C: CompStrings rest1 rest2 = true. It is the perfect occasion to use the induction hypothesis to prove that length rest1 = length rest2, and then use that result somehow to prove the goal.
Note that for that last step, there are many ways to proceed to prove S (length rest1) = S (length rest2). One of which is using f_equal. which asks you to prove a pairwise equality between the parameters of the constructor. You could also use a rewrite (IHs1 _ C). then use reflexivity on that goal.
Hopefully this will help you not only solve this particular goal, but get a first understanding at proofs by induction!
To close on this, here are two interesting links.
This presents the basics of induction (see paragraph "Induction on lists").
This explains, better than me, why and how to generalize your induction hypotheses. You'll learn how to solve the goal where I did intros s1 s2 C. by putting back the s2 in the goal before starting the induction, using the tactic generalize (dependent).
In general, I'd recommend reading the whole book. It's slow-paced and very didactic.

How do I create Haskell functions that return functions?

I would like to create three Haskell functions: a, b, and c.
Each function is to have one argument. The argument is one of the three functions.
I would like function a to have this behavior:
if the argument is function a then return function a.
if the argument is function b then return function c.
if the argument is function c then return function a.
Here's a recap of the behavior I desire for function a:
a a = a
a b = c
a c = a
And here's the behavior I desire for the other two functions:
b a = a
b b = a
b c = c
c a = c
c b = b
c c = c
Once created, I would like to be able to compose the functions in various ways, for example:
a (c b)
= a (b)
= c
How do I create these functions?
Since you have given no criteria for how you are going to observe the results, then a = b = c = id satisfies your criteria. But of course that is not what you want. But the idea is important: it doesn't just matter what behavior you want your functions to have, but how you are going to observe that behavior.
There is a most general model if you allow some freedom in the notation, and you get this by using an algebraic data type:
data F = A | B | C
deriving (Eq, Show) -- ability to compare for equality and print
infixl 1 %
(%) :: F -> F -> F
A % A = A
A % B = C
A % C = A
B % A = A
...
and so on. Instead of saying a b, you have to say A % B, but that is the only difference. You can compose them:
A % (C % B)
= A % B
= C
and you can turn them into functions by partially applying (%):
a :: F -> F
a = (A %)
But you cannot compare this a, as ehird says. This model is equivalent to the one you specified, it just looks a little different.
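Spelled out in full, the data-type model might look like the sketch below. The nine equations are copied from the table in the question; the wrappers b and c are added by analogy with a and are not part of the answer itself.

```haskell
data F = A | B | C
  deriving (Eq, Show) -- ability to compare for equality and print

infixl 1 %
(%) :: F -> F -> F
A % A = A
A % B = C
A % C = A
B % A = A
B % B = A
B % C = C
C % A = C
C % B = B
C % C = C

-- Partial applications turn each constructor into a function on F.
a, b, c :: F -> F
a = (A %)
b = (B %)
c = (C %)

main :: IO ()
main = do
  print (A % (C % B))   -- C % B = B, then A % B = C
  print (a (c B))       -- the same computation through the wrappers
```

Note that a (c b) does not type-check here, since b is a function rather than a value of type F; the arguments have to be the constructors A, B and C, whose results can be compared with (==).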
This is impossible; you can't compare functions to each other, so there's no way to check if your argument is a, b, c or something else.
Indeed, it would be impossible for Haskell to let you check whether two functions are the same: since Haskell is referentially transparent, substituting two different implementations of the same function should have no effect. That is, as long as you give the same output for every input, the exact implementation of a function shouldn't matter, and although proving that \x -> x+x and \x -> x*2 are the same function is easy, it's undecidable in general.
Additionally, there's no possible type that a could have if it's to take itself as an argument (sure, id id types, but id can take anything as its first argument — which means it can't examine it in the way you want to).
If you're trying to achieve something with this (rather than just playing with it out of curiosity — which is fine, of course), then you'll have to do it some other way. It's difficult to say exactly what way that would be without concrete details.
Well, you can do it like this:
{-# LANGUAGE MagicHash #-}
import GHC.Prim
import Unsafe.Coerce
This function is from ehird's answer here:
equal :: a -> a -> Bool
equal x y = x `seq` y `seq`
  case reallyUnsafePtrEquality# x y of
    1# -> True
    _ -> False
Now, let's get to business. Notice that you need to coerce the arguments and the return values as there is no possible type these functions can really have, as ehird pointed out.
a,b,c :: x -> y
a x | unsafeCoerce x `equal` a = unsafeCoerce a
    | unsafeCoerce x `equal` b = unsafeCoerce c
    | unsafeCoerce x `equal` c = unsafeCoerce a
b x | unsafeCoerce x `equal` a = unsafeCoerce a
    | unsafeCoerce x `equal` b = unsafeCoerce a
    | unsafeCoerce x `equal` c = unsafeCoerce c
c x | unsafeCoerce x `equal` a = unsafeCoerce c
    | unsafeCoerce x `equal` b = unsafeCoerce b
    | unsafeCoerce x `equal` c = unsafeCoerce c
Finally, some tests:
test = a (c b) `equal` c -- Evaluates to True
test' = a (c b) `equal` a -- Evaluates to False
Ehh...
As noted, functions can't be compared for equality. If you simply want functions that satisfy the algebraic laws in your specification, making them all equal to the identity function will do nicely.
I hope you are aware that if you post a homework-related question to Stack Overflow, the community expects you to identify it as such.