Shortening function definitions using multiline patterns in Haskell

Is it better to do:
charToAction 'q' = Just $ WalkRight False
charToAction 'd' = Just $ WalkRight True
charToAction 'z' = Just Jump
charToAction _ = Nothing
or
charToAction x = case x of
'q' -> Just $ WalkRight False
'd' -> Just $ WalkRight True
'z' -> Just Jump
otherwise -> Nothing
?

There's absolutely no functional difference whatsoever. It's a matter of personal preference.

There is no performance difference, because the first definition desugars to the second. You were right to prefer a case-plus-pattern-matching solution to one with guards and an equality test; some excellent general remarks are here: http://existentialtype.wordpress.com/2011/03/15/boolean-blindness/ (his examples are in ML but translate immediately to Haskell).
Note that you are misusing otherwise in the second definition; you should just write _ -> Nothing. It isn't the otherwise of guards that you are using: in a pattern position, otherwise is just a variable that matches anything, so you could as well have written fmap -> Nothing and the same would have happened.
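To see why, here is a minimal sketch (not from the original question; it may trigger warnings about shadowing and unused bindings, which is itself a hint that something is off):

-- Both alternatives compile and match everything, because otherwise
-- and fmap here are plain variable patterns, not the guard keyword:
f :: Int -> String
f x = case x of
        otherwise -> "always taken"

g :: Int -> String
g x = case x of
        fmap -> "also always taken"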

As others mentioned, there is no difference in the generated code, since the former approach desugars to the latter. However, I can point out some other considerations that may help you choose one or the other:
The former approach looks prettier in some circumstances (although I personally think that once you have lots of cases it is distracting)
The former approach makes it slightly more difficult to rename the function: with the case expression, the name of the function appears only once.
Case expressions can be used anywhere in your code and can be used anonymously (see the sketch below). The former approach requires a let or where block to define pattern-matching functions within a function.
However, there are no cases where you can't translate one into the other. It's entirely a matter of coding style.
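For instance (a small sketch; describe and its strings are invented for illustration), a case expression can live inside a lambda with no named helper at all:

-- An anonymous case expression inside a lambda; the clausal style would
-- need a named helper in a let or where block to do the same job.
describe :: [Int] -> [String]
describe = map (\n -> case n of
                        0 -> "zero"
                        _ -> "nonzero")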

Related

Haskell function definition convention

I am a beginner in Haskell.
The convention used for function definitions in my school material is as follows:
function_name arguments_separated_by_spaces = code_to_do
e.g.:
f a b c = a * b + c
As a mathematics student I am used to writing functions as follows:
function_name(arguments_separated_by_commas) = code_to_do
e.g.:
f(a,b,c) = a * b + c
It works in Haskell.
My doubt is whether it works in all cases.
I mean, can I use the traditional mathematical convention in Haskell function definitions too?
If not, in which specific cases does the convention go wrong?
Thanks in advance :)
Let's say you want to define a function that computes the square of the hypotenuse of a right triangle. Either of the following definitions is valid:
hyp1 a b = a * a + b * b
hyp2(a,b) = a * a + b * b
However, they are not the same function! You can tell by looking at their types in GHCi:
>> :type hyp1
hyp1 :: Num a => a -> a -> a
>> :type hyp2
hyp2 :: Num a => (a, a) -> a
Taking hyp2 first (and ignoring the Num a => part for now), the type tells you that the function takes a pair (a, a) and returns another a (e.g. it might take a pair of integers and return another integer, or a pair of real numbers and return another real number). You use it like this:
>> hyp2 (3,4)
25
Notice that the parentheses aren't optional here! They ensure that the argument is of the correct type, a pair of as. If you don't include them, you will get an error (which will probably look really confusing to you now, but rest assured that it will make sense when you've learned about type classes).
Now looking at hyp1 one way to read the type a -> a -> a is it takes two things of type a and returns something else of type a. You use it like this
>> hyp1 3 4
25
Now you will get an error if you do include parentheses!
So the first thing to notice is that the way you use the function has to match the way you defined it. If you define the function with parens, you have to use parens every time you call it. If you don't use parens when you define the function, you can't use them when you call it.
So it seems like there's no reason to prefer one over the other - it's just a matter of taste. But actually I think there is a good reason to prefer one over the other, and you should prefer the style without parentheses. There are three good reasons:
It looks cleaner and makes your code easier to read if you don't have parens cluttering up the page.
You will take a performance hit if you use parens everywhere, because you need to construct and deconstruct a pair every time you use the function (although the compiler may optimize this away - I'm not sure).
You want to get the benefits of currying, aka partially applied functions*.
The last point is a little subtle. Recall that I said that one way to understand a function of type a -> a -> a is that it takes two things of type a, and returns another a. But there's another way to read that type, which is a -> (a -> a). That means exactly the same thing, since the -> operator is right-associative in Haskell. The interpretation is that the function takes a single a, and returns a function of type a -> a. This allows you to just provide the first argument to the function, and apply the second argument later, for example
>> let f = hyp1 3
>> f 4
25
This is practically useful in a wide variety of situations. For example, the map function lets you apply some function to every element of a list:
>> :type map
map :: (a -> b) -> [a] -> [b]
Say you have the function (++ "!") which adds a bang to any String. But you have lists of Strings and you'd like them all to end with a bang. No problem! You just partially apply the map function
>> let bang = map (++ "!")
Now bang is a function of type**
>> :type bang
bang :: [String] -> [String]
and you can use it like this
>> bang ["Ready", "Set", "Go"]
["Ready!", "Set!", "Go!"]
Pretty useful!
I hope I've convinced you that the convention used in your school's educational material has some pretty solid reasons for being used. As someone with a math background myself, I can see the appeal of using the more 'traditional' syntax but I hope that as you advance in your programming journey, you'll be able to see the advantages in changing to something that's initially a bit unfamiliar to you.
* Note for pedants - I know that currying and partial application are not exactly the same thing.
** Actually GHCi will tell you the type is bang :: [[Char]] -> [[Char]], but since String is a synonym for [Char] these mean the same thing.
f(a,b,c) = a * b + c
The key difference to understand is that the above function takes a triple and gives the result. What you are actually doing is pattern matching on a triple. The type of the above function is something like this:
(a, a, a) -> a
If you write functions like this:
f a b c = a * b + c
You get currying automatically.
You can write things like let b = f 3 2 and it will typecheck, but the same thing will not work with your initial version. Currying also helps a lot when composing functions using (.), which again cannot be done with the tupled style unless you are composing functions on triples.
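A quick sketch of both points (the names are invented for illustration):

f :: Int -> Int -> Int -> Int
f a b c = a * b + c

-- Partial application typechecks: b is a function awaiting the last argument.
b :: Int -> Int
b = f 3 2            -- b c == 3 * 2 + c

-- And curried functions compose directly with (.):
negAfter :: Int -> Int
negAfter = negate . f 3 2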
Mathematical notation is not consistent. If all functions took their arguments using (,), you would have to write (+)((*)(a,b),c) to pass a*b and c to the function +; of course, a*b is itself worked out by passing a and b to the function *.
It is possible to write everything in tupled form, but it is much harder to define composition. Whereas now you can specify a type a->b to cover for functions of any arity (therefore, you can define composition as a function of type (b->c)->(a->b)->(a->c)), it is much trickier to define functions of arbitrary arity using tuples (now a->b would only mean a function of one argument; you can no longer compose a function of many arguments with a function of many arguments). So, technically possible, but it would need a language feature to make it simple and convenient.
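To make that concrete, here is a small sketch using the Prelude's curry and uncurry, which convert between the two styles:

hypTupled :: Num a => (a, a) -> a
hypTupled (a, b) = a * a + b * b

-- curry :: ((a, b) -> c) -> a -> b -> c turns the tupled form into
-- the curried one (uncurry goes the other way):
hypCurried :: Num a => a -> a -> a
hypCurried = curry hypTupled

-- Only the curried form slots straight into (.):
step :: Num a => a -> a
step = (+ 1) . hypCurried 3

Since uncurry goes the other direction, nothing is lost either way; it is purely a question of which form composes conveniently.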

Monads - where are they necessary?

The other day I was talking about functional programming, especially Haskell, with some Java/Scala guys, and they asked me what Monads are and where they are necessary.
Well the definition and examples were not that hard - Maybe Monad, IO Monad, State Monad etc., so everyone was, at least partially, ok with me saying Monads are a good thing.
But where are Monads necessary? Maybe can be avoided via magic values like -1 in the setting of Integer, or "" in the setting of String. I have written a game without the State Monad; it is not nice at all, but beginners do that.
So my question: where are Monads necessary, such that they cannot be avoided at all?
(And no confusion - I like Monads and use them, I just want to know).
EDIT
I think I have to clarify that I do not think using "magic values" is a good solution, but a lot of programmers use them, especially in low-level languages such as C or in shell scripts, where an error is often signaled by returning -1.
It was already clear to me that not using monads isn't a good idea. Abstraction is often very helpful, but also complicated to grasp, hence many people struggle with the concept of monads.
The very core of my question was whether it is possible to do, for example, IO without a monad while remaining pure and functional. I knew it would be tedious and painful to put a known good solution aside, like lighting a fire with flint and tinder instead of using a lighter.
The article @Antal S-Z refers to, "You Could Have Invented Monads!", is great; I skimmed it and will definitely read it when I have more time. The more revealing answer is hidden in a comment, in the blog post referred to by @Antal S-Z about remembering the time before monads; that was the stuff I was looking for when I asked the question.
I don't think you ever need monads. They're just a pattern that shows up naturally when you're working with certain kinds of function. The best explanation of this point of view that I've ever seen is Dan Piponi (sigfpe)'s excellent blog post "You Could Have Invented Monads! (And Maybe You Already Have.)", which this answer is inspired by.
You say you wrote a game without using the state monad. What did it look like? There's a good chance you ended up working with functions with types that looked something like openChest :: Player -> Location -> (Item,Player) (which opens a chest, maybe damages the player with a trap, and returns the found item). Once you need to combine those, you can either do so manually (let (item,player') = openChest player loc ; (x,player'') = func2 player' y in ...) or reimplement the state monad's >>= operator.
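A sketch of what that hand-threaded style tends to look like (Player, openChest, and drinkPotion are invented here); the player0/player1/player2 plumbing is exactly what State's >>= would hide:

data Player = Player { hp :: Int, items :: [String] }

-- Opening a chest may damage the player (a trap) and yields an item.
openChest :: Player -> String -> (String, Player)
openChest p _location = ("sword", p { hp = hp p - 1 })

drinkPotion :: Player -> ((), Player)
drinkPotion p = ((), p { hp = hp p + 5 })

-- Every step must pass the updated Player along by hand:
adventure :: Player -> (String, Player)
adventure player0 =
    let (item, player1) = openChest player0 "cave"
        ((), player2)   = drinkPotion player1
    in (item, player2 { items = item : items player2 })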
Or suppose that we're working in a language with hash maps/associative arrays, and we're not working with monads. We need to look up a few items and work with them; maybe we're trying to send a message between two users.
send username1 username2 = {
    user1 = users[username1]
    user2 = users[username2]
    sendMessage user1 user2 messageBody
}
But wait, this won't work; username1 and username2 might be missing, in which case they'll be nil or -1 or something instead of the desired value. Or maybe looking up a key in an associative array returns a value of type Maybe a, so this will even be a type error. Instead, we've got to write something like
send username1 username2 = {
    user1 = users[username1]
    if (user1 == nil) return
    user2 = users[username2]
    if (user2 == nil) return
    sendMessage user1 user2 messageBody
}
Or, using Maybe,
send username1 username2 =
    case users[username1] of
        Just user1 -> case users[username2] of
            Just user2 -> Just $ sendMessage user1 user2 messageBody
            Nothing    -> Nothing
        Nothing -> Nothing
Ick! This is messy and overly nested. So we define some sort of function which combines possibly-failing actions. Maybe something like
(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
Just x  >>= f = f x
Nothing >>= _ = Nothing
So you can write
send username1 username2 =
    users[username1] >>= \user1 ->
    users[username2] >>= \user2 ->
    Just (sendMessage user1 user2 messageBody)
If you really didn't want to use Maybe, then you could implement
x >>= f = if x == nil then nil else f x
The same principle applies.
Really, though, I recommend reading "You Could Have Invented Monads!" It's where I got this intuition for monads, and it explains the idea better and in more detail. Monads arise naturally when working with certain types. Sometimes you make that structure explicit and sometimes you don't, but just because you refrain from making it explicit doesn't mean it's not there. You never need to use monads in the sense that you don't need to work with that specific structure, but often it's a natural thing to do. And recognizing the common pattern, here as in many other things, can allow you to write some nicely general code.
(Also, as the second example I used shows, note that you've thrown the baby out with the bathwater by replacing Maybe with magic values. Just because Maybe is a monad doesn't mean you have to use it like one; lists are also monads, as are functions (of the form r ->), but you don't propose getting rid of them! :-))
You could take the phrase "where is/are X necessary and unavoidable?" and substitute anything at all in computing for X; what would be the point?
Instead, I think it's more valuable to ask, "what value does X provide?"
And the most basic answer is that most X's in computing provide a useful abstraction that makes it easier, less tedious, and less error-prone to put code together.
Okay, but you don't need the abstraction, right? I mean, I could just type out a little code by hand that does the same thing, right? Yeah, of course, it's all just a bunch of 0's and 1's, so let's see who can write an XML parser faster, me using Java/Haskell/C or you with a Turing machine.
Re monads: since monads typically deal with effectful computations, this abstraction is most useful when composing effectful functions.
I take issue with replacing the Maybe monad with magic values. That approach offers a very different abstraction to the programmer, and is less safe, more tedious, and more error-prone to deal with than an actual Maybe monad. Also, reading such code, the programmer's intent would be less clear. In other words, it misses the whole point of real monads, which is to provide an abstraction.
I'd also like to note that monads are not fundamental to Haskell:
do-notation is simply syntactic sugar, and can be entirely replaced by >>= and >> without any loss of expressiveness (see the sketch after this list)
they (and their combinators, such as join, >>=, mapM, etc.) can be written in Haskell
they can be written in any language that supports higher-order functions, or even in Java using objects. So if you had to work with a Lisp that didn't have monads, you could implement them in that Lisp yourself without too much trouble
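To illustrate the first point, here is a minimal sketch using only Prelude functions; the two definitions behave identically:

-- do-notation...
greet :: IO ()
greet = do
  name <- getLine
  putStrLn ("Hello, " ++ name)

-- ...and its direct desugaring into (>>=) and a lambda:
greet' :: IO ()
greet' = getLine >>= \name -> putStrLn ("Hello, " ++ name)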
Because monadic operations return an answer wrapped in the same type, implementations of that type can enforce and preserve its semantics. Then, in your code, you can chain operations on that type and let it enforce its rules, regardless of the type(s) it contains.
For example, the Optional class in Java 8 enforces the rule that the contained value is either present & non-null, or else not present. As long as you are using the Optional class, with or without using flatMap, you are wrapping that rule around the contained data type. No one can cheat or forget and add a value=null with present=true.
So declaring outside the code that -1 will be a sentinel value and mean such-and-such is fine, but you are still reliant on yourself and the other people working in the code to honor that semantic. If a new guy comes on board and starts using -1000000 to mean the same thing, then the semantics need to be enforced outside the code (perhaps with a lead pipe?) rather than through code mechanisms.
So rather than having to apply some rule consistently in your program, you can trust the monad to preserve that rule (or other semantics) -- over arbitrary types.
In this way, you can extend functionality of types by wrapping semantics around them, instead of, say, adding an "isPresent" to every type in your code base.
The presence of the numerous monadic utility types points to the fact that this mechanism of wrapping types with semantics is a pretty useful trick. If you have your own semantics that you'd like to add, you can do that by writing your own class using the monad pattern, and then inject strings or floats or ints or whatever into it.
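As a sketch of that "write your own class using the monad pattern" idea in Haskell (NonNeg and its rule are invented here purely for illustration):

-- A wrapper enforcing "the value is never negative", re-checked at every step:
newtype NonNeg = NonNeg (Maybe Double) deriving Show

mkNonNeg :: Double -> NonNeg
mkNonNeg x
  | x < 0     = NonNeg Nothing
  | otherwise = NonNeg (Just x)

-- Chaining in the monad pattern: no step can smuggle a negative value through.
andThen :: NonNeg -> (Double -> Double) -> NonNeg
andThen (NonNeg Nothing)  _ = NonNeg Nothing
andThen (NonNeg (Just x)) f = mkNonNeg (f x)

So mkNonNeg 4 `andThen` subtract 10 collapses to NonNeg Nothing, no matter where in the chain the rule is broken.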
But the short answer is that monads are a nice way to wrap common types in a fluent or chain-able container to add rules and usage without having to fuss with the implementation of the underlying types.

Erlang pattern matching with functions

As Erlang is an almost pure functional programming language, I'd imagine this was possible:
case X of
foo(Z) -> ...
end.
where foo(Z) is a decidable-invertible pure (side-effect free) bijective function, e.g.:
foo(input) -> output.
Then, in the case that X = output, Z would match as input.
Is it possible to use such semantics, with or without other syntax than my example, in Erlang?
No, what you want is not possible.
To do something like this you would need to be able to find the inverse of any bijective function, which is obviously undecidable.
I guess the reason why that is not allowed is that you want to guarantee the lack of side effects. Given the following structure:
case Expr of
Pattern1 [when GuardSeq1] ->
Body1;
...;
PatternN [when GuardSeqN] ->
BodyN
end
After you evaluate Expr, the patterns are sequentially matched against the result of Expr. Imagine your foo/1 function contains a side effect (e.g. it sends a message):
foo(input) ->
some_process ! some_msg,
output.
Even if the first pattern didn't match, you would already have sent the message, and you couldn't recover from that situation.
No, Erlang only supports literal patterns!
And your original request is not an easy one. Just because an inverse exists doesn't mean that it is easy to find. Practically, it would mean that the compiler would have to generate two versions of every function.
What you can do is:
Y = foo(Z),
case X of
Y -> ...
end.

Why do programming languages not allow spaces in identifiers?

This may seem like a dumb question, but still I don't know the answer.
Why do programming languages not allow spaces in names (for instance, method names)?
I understand it is to facilitate parsing, and that at some point it would be impossible to parse anything if spaces were allowed.
Nowadays we are so used to it that the norm is not to see spaces.
For instance:
object.saveData( data );
object.save_data( data )
object.SaveData( data );
[object saveData:data];
etc.
Could be written as:
object.save data( data ) // looks ugly, but that's the "natural" way.
If it is only for parsing, I guess the identifier could sit between the . and the (. Of course, procedural languages wouldn't be able to use this because there is no '.', but OO languages could.
I wonder if parsing is the only reason, and if it is, how important it is (I assume it must be, and that it would otherwise be impossible, unless all the programming language designers just... forgot the option).
EDIT
I'm OK with the view that spaces in identifiers in general (as in the FORTRAN example) are a bad idea. Narrowing it to OO languages, and specifically to methods, I don't see (I don't mean there isn't) a reason why it should be that way. After all, the . and the first ( may serve as delimiters.
And forget the saveData method; consider this one:
key.ToString().StartsWith("TextBox")
as:
key.to string().starts with("textbox");
Be cause i twoul d makepa rsing suc hcode reallydif ficult.
I used an implementation of ALGOL (c. 1978) which—extremely annoyingly—required quoting of what is now known as reserved words, and allowed spaces in identifiers:
"proc" filter = ("proc" ("int") "bool" p, "list" l) "list":
"if" l "is" "nil" "then" "nil"
"elif" p(hd(l)) "then" cons(hd(l), filter(p,tl(l)))
"else" filter(p, tl(l))
"fi";
Also, FORTRAN (the capitalized form means F77 or earlier) was more or less insensitive to spaces, so this could be written:
799 S = FLO AT F (I A+I B+I C) / 2 . 0
A R E A = SQ R T ( S *(S - F L O ATF(IA)) * (S - FLOATF(IB)) *
+ (S - F LOA TF (I C)))
which was syntactically identical to
799 S = FLOATF (IA + IB + IC) / 2.0
AREA = SQRT( S * (S - FLOATF(IA)) * (S - FLOATF(IB)) *
+ (S - FLOATF(IC)))
With that kind of history of abuse, why make parsing difficult for humans? Let alone complicate computer parsing.
Yes, it's the parsing - both human and computer. It's easier to read and easier to parse if you can safely assume that whitespace doesn't matter. Otherwise, you can have potentially ambiguous statements, statements where it's not clear how things go together, statements that are hard to read, etc.
Such a change would make for an ambiguous language in the best of cases. For example, in a C99-like language:
if not foo(int x) {
...
}
is that equivalent to:
A function definition of foo that returns a value of type ifnot:
ifnot foo(int x) {
...
}
A call to a function called notfoo with a variable named intx:
if notfoo(intx) {
...
}
A negated call to a function called foo (with C99's not which means !):
if not foo(intx) {
...
}
This is just a small sample of the ambiguities you might run into.
Update: I just noticed that obviously, in a C99-like language, the condition of an if statement would be enclosed in parentheses. Extra punctuation can help with ambiguities if you choose to ignore whitespace, but your language will end up having lots of extra punctuation wherever you would normally have used whitespace.
Before the interpreter or compiler can build a parse tree, it must perform lexical analysis, turning the stream of characters into a stream of tokens. Consider how you would want the following parsed:
a = 1.2423 / (4343.23 * 2332.2);
And consider how your rule above would work on it. It's hard to know how to tokenize it without understanding the meaning of the tokens, and it would be really hard to build a parser that did lexical analysis at the same time.
There are a few languages that allow spaces in identifiers. The fact that nearly all languages constrain the set of characters in identifiers is because parsing is easier that way and most programmers are accustomed to the compact no-whitespace style.
I don't think there's a deeper reason.
Check out Stroustrup's classic Generalizing Overloading for C++2000.
We were allowed to put spaces in filenames back in the 1960's, and computers still don't handle them very well (everything used to break, then most things, now it's just a few things - but they still break).
We simply can't wait another 50 years before our code will work again.
:-)
(And what everyone else said, of course. In English, we use spaces and punctuation to separate words. The same is true for computer languages, except that computer parsers define "words" in a slightly different sense.)
Using space as part of an identifier makes parsing really murky (is that a syntactic space or part of an identifier?), but the same sort of "natural reading" behavior is achieved with keyword arguments: object.save(data: something, atomically: true)
The TikZ language for creating graphics in LaTeX allows whitespace in parameter names (also known as 'keys'). For instance, you see things like
\shade[
top color=yellow!70,
bottom color=red!70,
shading angle={45},
]
In this restricted setting of a comma-separated list of key-value pairs, there's no parsing difficulty. In fact, I think it's much easier to read than the alternatives like topColor, top_color or topcolor.

How do you write a (simple) variable "toggle"?

Given the following idioms:
1)
variable = value1
if condition
variable = value2
2)
variable = value2
if not condition
variable = value1
3)
if condition
variable = value2
else
variable = value1
4)
if not condition
variable = value1
else
variable = value2
Which do you prefer, and why?
We assume the most common execution path to be that of condition being false.
I tend to lean towards using 1), although I'm not exactly sure why I like it more.
Note: The following examples may be simpler—and thus possibly more readable—but not all languages provide such syntax, and they are not suitable for extending the variable assignment to include more than one statement in the future.
variable = condition ? value2 : value1
...
variable = value2 if condition else value1
In theory, I prefer #3, as it avoids having to assign a value to the variable twice. In the real world, though, I use whichever of the four above is more readable or expresses my intention more clearly.
I prefer method 3 because it is more concise and a logical unit. It sets the value only once, it can be moved around as a block, and it's less error-prone (errors happen especially with method 1, when setting-to-value1 and checking-and-optionally-setting-to-value2 become separated by other statements).
3) is the clearest expression of what you want to happen. I think all the others require some extra thinking to determine which value is going to end up in the variable.
In practice, I would use the ternary operator (?:) if I was using a language that supported it. I prefer to write in functional or declarative style over imperative whenever I can.
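(For what it's worth, in a language like Haskell the conditional is already an expression, so the whole toggle is a single binding; a minimal sketch:)

-- if-then-else is an expression, so the "toggle" needs no statements at all:
toggle :: Bool -> a -> a -> a
toggle condition value1 value2 = if condition then value2 else value1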
I tend to use #1 a lot myself. if condition reads more easily than if !condition, especially if you accidentally miss the '!', at least to my mind.
Most coding I do is in C#, but I still tend to steer clear of the ternary operator unless I'm working with (mostly) local variables. Lines get long VERY quickly with a ternary operator if you're calling three layers deep into some structure, which quickly reduces readability again.
Note: The following examples may be simpler—and thus possibly more readable—but not all languages provide such syntax
This is no argument for not using them in languages that do provide such syntax. Incidentally, by my last count that includes all current mainstream languages.
and they are not suitable for extending the variable assignment to include more than one statement in the future.
This is true. However, it's often certain that such an extension will absolutely never take place because the condition will always yield one of two possible cases.
In such situations I will always prefer the expression variant over the statement variant, because it reduces syntactic clutter and improves expressiveness. In other situations I tend to go with the switch statement mentioned before, if the language allows this usage; if not, I fall back to a generic if.
A switch statement also works. If it's simple and there are more than 2 or 3 options, that's what I use.
In a situation where the condition might not happen, I would go with 1 or 2. Otherwise it's just based on what I want the code to do. (i.e. I agree with cruizer)
I tend to use if not...return.
But that's when you are looking to return a variable. Getting disqualifiers out of the way first tends to make the code more readable. It really depends on the context of the statement and also on the language. A case statement might work better and be readable most of the time, but performance suffers under VB, so a series of if/else statements makes more sense in that specific case.
Method 1 or method 3 for me. Method 1 can avoid an extra scope entrance/exit, but method 3 avoids an extra assignment. I'd tend to avoid Method 2 as I try to keep condition logic as simple as possible (in this case, the ! is extraneous as it could be rewritten as method 1 without it) and the same reason applies for method 4.
It depends on the condition I'm testing.
If it's an error-flag condition, then I'll use 1): set the error flag to catch the error, and then clear it if the condition is successful. That way there's no chance of missing an error condition.
For everything else I'd use 3)
The NOT logic just adds to the confusion when reading the code - well, in my head; I can't speak for everyone else :-)
If the variable has a natural default value I would go with #1. If either value is equally (in)appropriate for a default then I would go with #2.
It depends. I like the ternary operator, but sometimes it's clearer to use an 'if' statement. Which of the four alternatives you choose depends on the context, but I tend to go for whichever makes the code's function clearest, and that varies from situation to situation.