F# and statically checked union cases - html

Soon me and my brother-in-arms Joel will release version 0.9 of Wing Beats. It's an internal DSL written in F#. With it you can generate XHTML. One of the sources of inspiration have been the XHTML.M module of the Ocsigen framework. I'm not used to the OCaml syntax, but I do understand XHTML.M somehow statically check if attributes and children of an element are of valid types.
We have not been able to statically check the same thing in F#, and now I wonder if someone have any idea of how to do it?
My first naive approach was to represent each element type in XHTML as a union case. But unfortunately you cannot statically restrict which cases are valid as parameter values, as in XHTML.M.
Then I tried to use interfaces (each element type implements an interface for each valid parent) and type constraints, but I didn't manage to make it work without the use of explicit casting in a way that made the solution cumbersome to use. And it didn't feel like an elegant solution anyway.
Today I've been looking at Code Contracts, but it seems to be incompatible with F# Interactive. When I hit alt + enter it freezes.
Just to make my question clearer. Here is a super simple artificial example of the same problem:
type Letter =
| Vowel of string
| Consonant of string
let writeVowel =
function | Vowel str -> sprintf "%s is a vowel" str
I want writeVowel to only accept Vowels statically, and not as above, check it at runtime.
How can we accomplish this? Does anyone have any idea? There must be a clever way of doing it. If not with union cases, maybe with interfaces? I've struggled with this, but am trapped in the box and can't think outside of it.

It looks like that library uses O'Caml's polymorphic variants, which aren't available in F#. Unfortunately, I don't know of a faithful way to encode them in F#, either.
One possibility might be to use "phantom types", although I suspect that this could become unwieldy given the number of different categories of content you're dealing with. Here's how you could handle your vowel example:
module Letters = begin
(* type level versions of true and false *)
type ok = class end
type nok = class end
type letter<'isVowel,'isConsonant> = private Letter of char
let vowel v : letter<ok,nok> = Letter v
let consonant c : letter<nok,ok> = Letter c
let y : letter<ok,ok> = Letter 'y'
let writeVowel (Letter(l):letter<ok,_>) = sprintf "%c is a vowel" l
let writeConsonant (Letter(l):letter<_,ok>) = sprintf "%c is a consonant" l
end
open Letters
let a = vowel 'a'
let b = consonant 'b'
let y = y
writeVowel a
//writeVowel b
writeVowel y

Strictly speaking, if you want to distinguish between something at compile-time, you need to give it different types. In your example, you could define two types of letters and then the type Letter would be either the first one or the second one.
This is a bit cumbersome, but it's probably the only direct way to achieve what you want:
type Vowel = Vowel of string
type Consonant = Consonant of string
type Letter = Choice<Vowel, Consonant>
let writeVowel (Vowel str) = sprintf "%s is a vowel" str
writeVowel (Vowel "a") // ok
writeVowel (Consonant "a") // doesn't compile
let writeLetter = function
| Choice1Of2(Vowel str) -> sprintf "%s is a vowel" str
| Choice2Of2(Consonant str) -> sprintf "%s is a consonant" str
The Choice type is a simple discriminated union which can store either a value of the first type or a value of the second type - you could define your own discriminated union, but it is a bit difficult to come up with reasonable names for the union cases (due to the nesting).
Code Contracts allow you to specify properties based on values, which would be more appropriate in this case. I think they should work with F# (when creating F# application), but I don't have any experience with integrating them with F#.
For numeric types, you can also use units of measure, which allow you to add additional information to the type (e.g. that a number has a type float<kilometer>), but this isn't available for string. If it was, you could define units of measure vowel and consonant and write string<vowel> and string<consonant>, but units of measure focus mainly on numerical applications.
So, perhaps the best option is to rely on runtime-checks in some cases.
[EDIT] To add some details regarding the OCaml implementation - I think that the trick that makes this possible in OCaml is that it uses structural subtyping, which means (translated to the F# terms) that you can define discriminated union with some mebers (e.g. only Vowel) and then another with more members (Vowel and Consonant).
When you create a value Vowel "a", it can be used as an argument to functions taking either of the types, but a value Consonant "a" can be used only with functions taking the second type.
This unfrotunately cannot be easily added to F#, because .NET doesn't natively support structural subtyping (although it may be possible using some tricks in .NET 4.0, but that would have to be done by the compiler). So, I know understand your problem, but I don't have any good idea how to solve it.
Some form of structural subtyping can be done using static member constraints in F#, but since discriminated union cases aren't types from the F# point of view, I don't think it is usable here.

My humble suggestion is: if the type system does not easily support statically checking 'X', then don't go through ridiculous contortions trying to statically check 'X'. Just throw an exception at runtime. The sky will not fall, the world will not end.
Ridiculous contortions to gain static checking often come at the expense of complicating an API, and make error messages indecipherable, and cause other degradations at the seams.

You can use inline functions with statically-resolved type parameters to yield different types depending on context.
let inline pcdata (pcdata : string) : ^U = (^U : (static member MakePCData : string -> ^U) pcdata)
let inline a (content : ^T) : ^U = (^U : (static member MakeA : ^T -> ^U) content)
let inline br () : ^U = (^U : (static member MakeBr : unit -> ^U) ())
let inline img () : ^U = (^U : (static member MakeImg : unit -> ^U) ())
let inline span (content : ^T) : ^U = (^U : (static member MakeSpan : ^T -> ^U) content)
Take the br function, for example. It will produce a value of type ^U, which is statically resolved at compilation. This will only compile if ^U has a static member MakeBr. Given the example below, that could produce either a A_Content.Br or a Span_Content.Br.
You then define a set of types to represent legal content. Each exposes "Make" members for the content that it accepts.
type A_Content =
| PCData of string
| Br
| Span of Span_Content list
static member inline MakePCData (pcdata : string) = PCData pcdata
static member inline MakeA (pcdata : string) = PCData pcdata
static member inline MakeBr () = Br
static member inline MakeSpan (pcdata : string) = Span [Span_Content.PCData pcdata]
static member inline MakeSpan content = Span content
and Span_Content =
| PCData of string
| A of A_Content list
| Br
| Img
| Span of Span_Content list
with
static member inline MakePCData (pcdata : string) = PCData pcdata
static member inline MakeA (pcdata : string) = A_Content.PCData pcdata
static member inline MakeA content = A content
static member inline MakeBr () = Br
static member inline MakeImg () = Img
static member inline MakeSpan (pcdata : string) = Span [PCData pcdata]
static member inline MakeSpan content = Span content
and Span =
| Span of Span_Content list
static member inline MakeSpan (pcdata : string) = Span [Span_Content.PCData pcdata]
static member inline MakeSpan content = Span content
You can then create values...
let _ =
test ( span "hello" )
test ( span [pcdata "hello"] )
test (
span [
br ();
span [
br ();
a [span "Click me"];
pcdata "huh?";
img () ] ] )
The test function used there prints XML... This code shows that the values are reasonable to work with.
let rec stringOfAContent (aContent : A_Content) =
match aContent with
| A_Content.PCData pcdata -> pcdata
| A_Content.Br -> "<br />"
| A_Content.Span spanContent -> stringOfSpan (Span.Span spanContent)
and stringOfSpanContent (spanContent : Span_Content) =
match spanContent with
| Span_Content.PCData pcdata -> pcdata
| Span_Content.A aContent ->
let content = String.concat "" (List.map stringOfAContent aContent)
sprintf "<a>%s</a>" content
| Span_Content.Br -> "<br />"
| Span_Content.Img -> "<img />"
| Span_Content.Span spanContent -> stringOfSpan (Span.Span spanContent)
and stringOfSpan (span : Span) =
match span with
| Span.Span spanContent ->
let content = String.concat "" (List.map stringOfSpanContent spanContent)
sprintf "<span>%s</span>" content
let test span = printfn "span: %s\n" (stringOfSpan span)
Here's the output:
span: <span>hello</span>
span: <span><br /><span><br /><a><span>Click me</span></a>huh?<img /></span></span>
Error messages seem reasonable...
test ( div "hello" )
Error: The type 'Span' does not support any operators named 'MakeDiv'
Because the Make functions and the other functions are inline, the generated IL is probably similar to what you would code by hand if you were implementing this without the added type safety.
You could use the same approach to handle attributes.
I do wonder if it will degrade at the seams, as Brian pointed out contortionist solutions might. (Does this count as contortionist or not?) Or if it will melt the compiler or the developer down by the time it implements all of XHTML.

Classes?
type Letter (c) =
member this.Character = c
override this.ToString () = sprintf "letter '%c'" c
type Vowel (c) = inherit Letter (c)
type Consonant (c) = inherit Letter (c)
let printLetter (letter : Letter) =
printfn "The letter is %c" letter.Character
let printVowel (vowel : Vowel) =
printfn "The vowel is %c" vowel.Character
let _ =
let a = Vowel('a')
let b = Consonant('b')
let x = Letter('x')
printLetter a
printLetter b
printLetter x
printVowel a
// printVowel b // Won't compile
let l : Letter list = [a; b; x]
printfn "The list is %A" l

Thanks for all the suggestions! Just in case it will inspire anyone to come up with a solution to the problem: below is a simple HTML page written in our DSL Wing Beats. The span is a child of the body. This is not valid HTML. It would be nice if it didn't compile.
let page =
e.Html [
e.Head [ e.Title & "A little page" ]
e.Body [
e.Span & "I'm not allowed here! Because I'm not a block element."
]
]
Or are there other ways to check it, that we have not thought about? We're pragmatic! Every possible way is worth investigating. One of the major goals with Wing Beats is to make it act like an (X)Html expert system, that guides the programmer. We want to be sure a programmer only produces invalid (X)Html if he chooses to, not because of lacking knowledge or careless mistakes.
We think we have a solution for statically checking the attributes. It looks like this:
module a =
type ImgAttributes = { Src : string; Alt : string; (* and so forth *) }
let Img = { Src = ""; Alt = ""; (* and so forth *) }
let link = e.Img { a.Img with Src = "image.jpg"; Alt = "An example image" };
It has its pros and cons, but it should work.
Well, if anyone comes up with anything, let us know!

Related

Unable to define a parser in Haskell: Not in scope: type variable ‘a’

I am trying to define a parser in Haskell. I am a total beginner and somehow didn't manage to find any solution to my problem at all.
For the first steps I tried to follow the instructions on the slides of a powerpoint presentation. But I constantly get the error "Not in scope: type variable ‘a’":
type Parser b = a -> [(b,a)]
item :: Parser Char
item = \inp -> case inp of
[] -> []
(x:xs) -> [(x:xs)]
error: Not in scope: type variable ‘a’
|
11 | type Parser b = a -> [(b,a)]
| ^
I don't understand the error but moreover I don't understand the first line of the code as well:
type Parser b = a -> [(b,a)]
What is this supposed to do? On the slide it just tells me that in Haskell, Parsers can be defined as functions. But that doesn't look like a function definition to me. What is "type" doing here? If it s used to specify the type, why not use "::" like in second line above? And "Parser" seems to be a data type (because we can use it in the type definition of "item"). But that doesn't make sense either.
The line derives from:
type Parser = String -> (String, Tree)
The line I used in my code snippet above is supposed to be a generalization of that.
Your help would be much appreciated. And please bear in mind that I hardly know anything about Haskell, when you write an answer :D
There is a significant difference between the type alias type T = SomeType and the type annotation t :: SomeType.
type T = Int simply states that T is just another name for the type Int. From now on, every time we use T, it will be replaced with Int by the compiler.
By contrast, t :: Int indicates that t is some value of type Int. The exact value is to be specified by an equation like t = 42.
These two concepts are very different. On one hand we have equations like T = Int and t = 42, and we can replace either side with the other side, replacing type with types and values with values. On the other hand, the annotation t :: Int states that a value has a given type, not that the value and the type are the same thing (which is nonsensical: 42 and Int have a completely different nature, a value and a type).
type Parser = String -> (String, Tree)
This correctly defines a type alias. We can make it parametric by adding a parameter:
type Parser a = String -> (String, a)
In doing so, we can not use variables in the right hand side that are not parameters, for the same reason we can not allow code like
f x = x + y -- error: y is not in scope
Hence you need to use the above Parser type, or some variation like
type Parser a = String -> [(String, a)]
By contrast, writing
type Parser a = b -> [(b, a)] -- error
would use an undeclared type b, and is an error. At best, we could have
type Parser a b = b -> [(b, a)]
which compiles. I wonder, though, is you really need to make the String type even more general than it is.
So, going back to the previous case, a possible way to make your code run is:
type Parser a = String -> [(a, String)]
item :: Parser Char
item = \inp -> case inp of
[] -> []
(x:xs) -> [(x, xs)]
Note how [(x, xs)] is indeed of type [(Char, String)], as needed.
If you really want to generalize String as well, you need to write:
type Parser a b = b -> [(b, a)]
item :: Parser Char String
item = \inp -> case inp of
[] -> []
(x:xs) -> [(xs, x)]

How to serialize function type to json in haskell?

data Task = Task
{ id :: String
, description :: String
, dependsOn :: [String]
, dependentTasks :: [String]
} deriving (Eq, Show, Generic, ToJSON, FromJSON)
type Storage = Map String Task
s :: Storage
s = empty
addTask :: Task -> Storage -> Storage
addTask (Task id desc dep dept) = insert id (Task id desc dep dept)
removeTask :: String -> Storage -> Storage
removeTask tid = delete tid
changes = [addTask (Task "1" "Description" [] []), removeTask "1"]
main = putStrLn . show $ foldl (\s c -> c s) s changes
Suppose I have the following code. I want to store changes list in a json file. But I don't know how to do that with Aeson, aside probably from writing a custom parser and there must be a better way to do that obviously. Like maybe using language extension to derive (Generic, ToJSON, FromJSON) for addTask and removeTask etc...
EDIT. For all people that say "You can't serialize function".
Read the comments to an answer to this question.
Instance Show for function
That said, it's not possible to define Show to actually give you more
? detail about the function. – Louis Wasserman May 12 '12 at 14:51
Sure it is. It can show the type (given via Typeable); or it can show some of the inputs and outputs (as is done in QuickCheck).
EDIT2. Okay, I got that I can't have function name in serialization. But can this be done via template Haskell? I see that aeson supports serialization via template Haskell, but as newcomer to Haskell can't figure out how to do that.
Reading between the lines a bit, a recurring question here is, "Why can't I serialize a function (easily)?" The answer -- which several people have mentioned, but not explained clearly -- is that Haskell is dedicated to referential transparency. Referential transparency says that you can replace a definition with its defined value (and vice versa) without changing the meaning of the program.
So now, let's suppose we had a hypothetical serializeFunction, which in the presence of this code:
foo x y = x + y + 3
Would have this behavior:
> serializeFunction (foo 5)
"foo 5"
I guess you wouldn't object too strenuously if I also claimed that in the presence of
bar x y = x + y + 3
we would "want" this behavior:
> serializeFunction (bar 5)
"bar 5"
And now we have a problem, because by referential transparency
serializeFunction (foo 5)
= { definition of foo }
serializeFunction (\y -> 5 + y + 3)
= { definition of bar }
serializeFunction (bar 5)
but "foo 5" does not equal "bar 5".
The obvious followup question is: why do we demand referential transparency? There are at least two good reasons: first, it allows equational reasoning like above, hence eases the burden of refactoring; and second, it reduces the amount of runtime information that's needed, hence improving performance.
Of course, if you can come up with a representation of functions that respects referential transparency, that poses no problems. Here are some ideas in that direction:
printing the type of the function
instance (Typeable a, Typeable b) => Show (a -> b) where
show = show . typeOf
-- can only write a Read instance for trivial functions
printing the input-output behavior of the function (which can also be read back in)
creating a data type that combines a function with its name, and then printing that name
data Named a = Named String a
instance Show (Named a) where
show (Named n _) = n
-- perhaps you could write an instance Read (Map String a -> Named a)
(and see also cloud haskell for a more complete working of this idea)
constructing an algebraic data type that can represent all the expressions you care about but contains only basic types that already have a Show instance and serializing that (e.g. as described in the other answer)
But printing a bare function's name is in conflict with referential transparency.
Make a data type for your functions and an evaluation function:
data TaskFunction = AddTask Task | RemoveTask String
deriving (Eq, Show, Generic, ToJSON, FromJSON)
eval :: TaskFunction -> Storage -> Storage
eval (AddTask t) = addTask t
eval (RemoveTask t) = removeTask t
changes = [AddTask (Task "1" "Description" [] []), RemoveTask "1"]
main = putStrLn . show $ foldl (\s c -> c s) s (eval <$> changes)

Difference between let, fun and function in F#

I'm learning F# and I cannot figure out what the difference between let, fun and function is, and my text book doesn't really explain that either. As an example:
let s sym = function
| V x -> Map.containsKey x sym
| A(f, es) -> Map.containsKey f sym && List.forall (s sym) es;;
Couldn't I have written this without the function keyword? Or could I have written that with fun instead of function? And why do I have to write let when I've seen some examples where you write
fun s x =
...
What's the difference really?
I guess you should really ask MSDN, but in a nutshell:
let binds a value with a symbol. The value can be a plain type like an int or a string, but it can also be a function. In FP functions are values and can be treated in the same way as those types.
fun is a keyword that introduces an anonymous function - think lambda expression if you're familiar with C#.
Those are the two important ones, in the sense that all the others usages you've seen can be thought as syntax sugar for those two. So to define a function, you can say something like this:
let myFunction =
fun firstArg secondArg ->
someOperation firstArg secondArg
And that's very clear way of saying it. You declare that you have a function and then bind it to the myFunction symbol.
But you can save yourself some typing by just conflating anonymous function declaration and binding it to a symbol with let:
let myFunction firstArg secondArg =
someOperation firstArg secondArg
What function does is a bit trickier - you combine an anonymous single-argument function declaration with a match expression, by matching on an implicit argument. So these two are equivalent:
let myFunction firstArg secondArg =
match secondArg with
| "foo" -> firstArg
| x -> x
let myFunction firstArg = function
| "foo" -> firstArg
| x -> x
If you're just starting on F#, I'd steer clear of that one. It has its uses (mainly for providing succinct higher order functions for maps/filters etc.), but results in code less readable at a glance.
These things are sort of shortcuts to each other.
The most fundamental thing is let. This keyword gives names to stuff:
let name = "stuff"
Speaking more technically, the let keyword defines an identifier and binds it to a value:
let identifier = "value"
After this, you can use words name and identifier in your program, and the compiler will know what they mean. Without the let, there wouldn't be a way to name stuff, and you'd have to always write all your stuff inline, instead of referring to chunks of it by name.
Now, values come in different flavors. There are strings "some string", there are integer numbers 42, floating point numbers 5.3, Boolean values true, and so on. One special kind of value is function. Functions are also values, in most respects similar to strings and numbers. But how do you write a function? To write a string, you use double quotes, but what about function?
Well, to write a function, you use the special word fun:
let squareFn = fun x -> x*x
Here, I used the let keyword to define an identifier squareFn, and bind that identifier to a value of the function kind. Now I can use the word squareFn in my program, and the compiler will know that whenever I use it I mean a function fun x -> x*x.
This syntax is technically sufficient, but not always convenient to write. So in order to make it shorter, the let binding takes an extra responsibility upon itself and provides a shorter way to write the above:
let squareFn x = x*x
That should do it for let vs fun.
Now, the function keyword is just a short form for fun + match. Writing function is equivalent to writing fun x -> match x with, period.
For example, the following three definitions are equivalent:
let f = fun x ->
match x with
| 0 -> "Zero"
| _ -> "Not zero"
let f x = // Using the extra convenient form of "let", as discussed above
match x with
| 0 -> "Zero"
| _ -> "Not zero"
let f = function // Using "function" instead of "fun" + "match"
| 0 -> "Zero"
| _ -> "Not zero"

partial deconstruction in pattern-matching (F#)

Following a minimal example of an observation (that kind of astonished me):
type Vector = V of float*float
// complete unfolding of type is OK
let projX (V (a,_)) = a
// also works
let projX' x =
match x with
| V (a, _) -> a
// BUT:
// partial unfolding is not Ok
let projX'' (V x) = fst x
// consequently also doesn't work
let projX''' x =
match x with
| V y -> fst y
What is the reason that makes it impossible to match against a partially deconstructed type?
Some partial deconstructions seem to be ok:
// Works
let f (x,y) = fst y
EDIT:
Ok, I now understand the "technical" reason of the behavior described (Thanks for your answers & comments). However, I think that language wise, this behavior feels a bit "unnatural" compared to rest of the language:
"Algebraically", to me, it seems strange to distinguish a type "t" from the type "(t)". Brackets (in this context) are used for giving precedence like e.g. in "(t * s) * r" vs "t * (s * r)". Also fsi answers accordingly, whether I send
type Vector = (int * int)
or
type Vector = int * int
to fsi, the answer is always
type Vector = int * int
Given those observations, one concludes that "int * int" and "(int * int)" denote exactly the same types and thus that all occurrences of one could in any piece of code be replaced with the other (ref. transparency)... which as we have seen is not true.
Further it seems significant that in order to explain the behavior at hand, we had to resort to talk about "how some code looks like after compilation" rather than about semantic properties of the language which imo indicates that there are some "tensions" between language semantics an what the compiler actually does.
In F#
type Vector = V of float*float
is just a degenerated union (you can see that by hovering it in Visual Studio), so it's equivalent to:
type Vector =
| V of float*float
The part after of creates two anonymous fields (as described in F# reference) and a constructor accepting two parameters of type float.
If you define
type Vector2 =
| V2 of (float*float)
there's only one anonymous field which is a tuple of floats and a constructor with a single parameter. As it was pointed out in the comment, you can use Vector2 to do desired pattern matching.
After all of that, it may seem illogical that following code works:
let argsTuple = (1., 1.)
let v1 = V argsTuple
However, if you take into account that there's a hidden pattern matching, everything should be clear.
EDIT:
F# language spec (p 122) states clearly that parenthesis matter in union definitions:
Parentheses are significant in union definitions. Thus, the following two definitions differ:
type CType = C of int * int
type CType = C of (int * int)
The lack of parentheses in the first example indicates that the union case takes two arguments. The parentheses
in the second example indicate that the union case takes one argument that is a first-class tuple value.
I think that such behavior is consistent with the fact that you can define more complex patterns at the definition of a union, e.g.:
type Move =
| M of (int * int) * (int * int)
Being able to use union with multiple arguments also makes much sense, especially in interop situation, when using tuples is cumbersome.
The other thing that you used:
type Vector = int * int
is a type abbreviation which simply gives a name to a certain type. Placing parenthesis around int * int does not make a difference because those parenthesis will be treated as grouping parenthesis.

F# Inline Function Specialization

My current project involves lexing and parsing script code, and as such I'm using fslex and fsyacc. Fslex LexBuffers can come in either LexBuffer<char> and LexBuffer<byte> varieties, and I'd like to have the option to use both.
In order to user both, I need a lexeme function of type ^buf -> string. Thus far, my attempts at specialization have looked like:
let inline lexeme (lexbuf: ^buf) : ^buf -> string where ^buf : (member Lexeme: char array) =
new System.String(lexbuf.Lexeme)
let inline lexeme (lexbuf: ^buf) : ^buf -> string where ^buf : (member Lexeme: byte array) =
System.Text.Encoding.UTF8.GetString(lexbuf.Lexeme)
I'm getting a type error stating that the function body should be of type ^buf -> string, but the inferred type is just string. Clearly, I'm doing something (majorly?) wrong.
Is what I'm attempting even possible in F#? If so, can someone point me to the proper path?
Thanks!
Functions and members marked as inline cannot be overloaded, so your original strategy won't work. You need to write different code for both of the declarations, so you need to use overloading (if you want to write this without boxing and dynamic type tests).
If you're using standard F# tools, then the type you'll get as a buffer will always be LexBuffer<'T> and you want to have two overloads based on the type argument. In this case, you don't need the static member constraints at all and can write just:
type Utils =
static member lexeme(buf:LexBuffer<char>) =
new System.String(buf.Lexeme)
static member lexeme(buf:LexBuffer<byte>) =
System.Text.Encoding.UTF8.GetString(buf.Lexeme)
Are you sure this strategy of redefining inline functions with different argument types can work? Looks like you're trying to overload to me...
type LexBuffer<'a>(data : 'a []) =
member this.Lexeme = data
let lexeme (buf : LexBuffer<'a>) =
match box buf.Lexeme with
| :? (char array) as chArr ->
new System.String(chArr)
| :? (byte array) as byArr ->
System.Text.Encoding.UTF8.GetString(byArr)
| _ -> invalidArg "buf" "must be either char or byte LexBuffer"
new LexBuffer<byte>([| 97uy; 98uy; 99uy |])
|> lexeme
|> printfn "%A"
new LexBuffer<char>([| 'a'; 'b'; 'c' |])
|> lexeme
|> printfn "%A"