Python's json.dumps (or) loads in haskell-aeson? - json

In Aeson library meant for object serializing/deserializing, I see the functions, FromJSON & ToJSON declared as instances. The code is,
data Coord = Coord { x :: Double, y :: Double }
deriving (Show)
instance ToJSON Coord where
toJSON (Coord xV yV) = object [ "x" .= xV,
"y" .= yV ]
My questions are,
Why does the author create ToJSON/FromJSON instances with just one method? Can't toJSON/parseJSON be written as a function on its own?
In Python, one just does json.loads/json.dumps to handle any kind of object/json-string. Why does the haskell user need to write all these extra code for every object that he seralizes?
For composite objects with multiple hierarchies like
{"a":
{"b":
{
"c":1
}
}
}
, do we need to create multiple data and instance at each level?

Why does the author create ToJSON/FromJSON instances with just one method? Can't toJSON/parseJSON be written as a function on its own?
You misunderstand a lot of things, so let me clear this up a little. ToJSON and FromJSON aren't functions. These are typeclasses. Typeclasses are a way to write polymorphic code in Haskell.
Here I would explain a very simplified and incomplete definition of json serialization. First of all we declare a typeclass:
class ToJSON a where
toJSON :: a -> String
This statement basically says: "If a is an instance of typeclass ToJSON, then we can use function toJSON to serialize a into a JSON string".
When a typeclass is defined, one can implement instances of it for a variety of simple types:
instance ToJSON String where
toJSON s = s
instance ToJSON Int where
toJSON n = show n
instance ToJSON Double where
toJSON n = show n
After you defined these simple implementation, you can apply toJSON to values of either String, Int or Double and it would get dispatched to a right implementation:
toJSON "hello" -----> "hello"
toJSON (5 :: Int) -----> "5"
toJSON (5.5 :: Double) -----> "5.5"
To go further we need a way to encode JSON collections. Let's start with lists. We want to express that if there is a value a that can be serialized into JSON, then a list of such values can also be serialized into JSON.
-- ,-- value 'a' can be serialized into JSON
-- ,--------,
instance (ToJSON a) => ToJSON [a] where
-- ``````````-- A list of such values can also be serialized
-- | Here is how serialization can be performed
toJSON as = "[" ++ (intercalate ", " $ map toJSON as) ++ "]"
We serialize each value in the list, separate them with ", " and enclose in brackets. Note that recursive call to toJSON gets dispatched to the correct implementation.
Now we can use toJSON on lists:
toJSON [1,2,3,4] -----> "[1, 2, 3, 4]"
You can go further and try to implement the whole JSON syntax. Your next step here might be maps. I'll leave it as an exercise.
My point was to explain that when you write instance ToJSON Coord ... you simply provide a way to serialize Coord into JSON. And this gives you an ability to serialize lists of Coords, maps with Coords and many other things. This wouldn't be possible without typeclasses.
In Python, one just does json.loads/json.dumps to handle any kind of object/json-string. Why does the haskell user need to write all these extra code for every object that he seralizes?
An important point is that Python's json.loads wouldn't deserialize json into your object. It would deserialize it into a built in structure that might be equivalent to your object. You can do the same thing in Haskell by using template haskell which would declare ToJSON/FromJSON instances for you. Alternatively you can just dump the JSON into a key value Map and operate on it.
However, writing that extra code (or automatically generating it) gives you a lot of benefits which can be summarized with words "type safety".
For composite objects with multiple hierarchies like ..., do we need to create multiple data and instance at each level?
No you don't. In case of a structure that you linked the instances that would transform a number into such a structure or vice-versa would look approximately like this:
-- | Just a wrapper for the number which must be stored in a nested structure
newtype NestedStructure = NestedStructure Int
instance ToJSON NestedStructure where
toJSON (NestedStructure n) =
object ["a" .= object ["b" .= object ["c" .= n]]]
instance FromJSON NestedStructure where
fromJSON (Object o) = NestedStructure <$> ((o .: "a") >>= (.: "b")
>>= (.: "c"))
fromJSON _ = mzero

In Python, one just does json.loads/json.dumps to handle any kind of
object/json-string. Why does the haskell user need to write all these
extra code for every object that he seralizes?
One way to avoid this is to use the deriving mechanism in combination with GHC.Generics.
The "deriving" mechanism automatically generates typeclass instances for you, avoiding boilerplate. For example:
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics
data VDPServer = VDPServer
{ vdpHost :: String
, vdpPort :: Int
, vdpLogin :: String
, vdpPassword :: String
, vdpDatabase :: String
}
deriving Generic
instance FromJSON VDPServer
instance ToJSON VDPServer
This is descrived in the Type Conversion section of the documentation.
You can customize how the instances are generated using the Options type:
aesonOptions :: Options
aesonOptions = defaultOptions
{ sumEncoding = ObjectWithSingleField
, fieldLabelModifier = tail
, omitNothingFields = True
}
instance FromJSON VDPServer where
parseJSON = genericParseJSON aesonOptions
instance ToJSON VDPServer where
toJSON = genericToJSON aesonOptions
Sometimes, when dealing with complex preexisting JSON schemas, this approach doesn't work so well and one has to fall back to manually defining the parser. But for simpler cases it avoids you a lot of boilerplate.
do we need to create multiple data and instance at each level?
All of the record fields must have their own FromJSON/ToJSON instances. Many common types (tuples, lists, maps, strings...) already have such instances, see the instance list for FromJSON in the documentation. But if not, you'll have to define them (maybe using the Generic trick again).
In Python, one just does json.loads/json.dumps to handle any kind of
object/json-string. Why does the haskell user need to write all these
extra code for every object that he seralizes?
The Haskell equivalent of deserializing a JSON file into a composite of maps, lists, and privitive types is to read a Value object. This way one doesn't have to define a new record.

Related

How to implement toJSON for an assoc-list producing an object with key-values pairs generically (using Aeson)?

I have data which is a map. To make the question more concrete, let's think that it's represented as an assoc-list type D val = [(Key,val)] (or as type D val = Map Key val).
Key is an "enum" type -- a sum of nullary constructors, e.g.:
data Key = C1 | C2 | C3
There must be instance ToJSON for val.
I'd like to implement
instance ToJSON val => ToJSON (D val)
which would be like instance Map String val (producing an object with the corresponding key-value pairs) (an example, another example for HashMaps), only more generic.
More generic means that I'm not restricted to String or Text as the key (such instances for Map are found in the aeson package).
I want to learn how to use the generic mechanisms in aeson to do this concisely.
aeson already can convert "enums" to JSON strings, and also (C val1 val2 ...) to objects with a single key (the constructor name, like this I assume: { "C":[val1,val2,...] }) with genericToJSON defaultOptions{ sumEncoding = ObjectWithSingleField }. And another conversion which uses the constructor names is with the default options for data types with non-nullary constructors: then the constructor name becomes the value of a "tag" key. I could use similar code.
I think that even such an instance (for maps whose keys are from enums) would be useful addition to the library. But AFAIU neither aeson nor generic-aeson (which has adds some nicer conversion options) has such a conversion predefined.
A quick and short, but ugly solution is:
instance (ToJSON a) => ToJSON (D a)
where toJSON kvs = object [ t .= v | (k,v) <- kvs, let (String t) = toJSON k ]
Advantages:
it re-uses the generic mechanisms of aeson to get the key representation (as text);
it would re-use the same conversion rules for the names of the fields here as defined in the instance ToJSON Key, so we don't need to write them twice, or to care about consistency.
Disadvantages:
converting first to Object just to extract a String is simply ugly;
it introduces a dangerous non-total conversion (which even may fail if toJSON for Key is implemented in a non-default way;
there is no compile-time check that Key is an enum relying on types. (Actually, I'm not very keen on understanding the generics type machinery yet, so I see that such checks and choices can be done because aeson is implemented this way, but I wouldn't be able to write this myself yet.)

How can I use multiple 'Data' instance when one of them is 'deduced' from a type function?

I'm trying to make the following code work (well, compiling first!):
module Orexio.Radix where
import Data.Data
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Typeable
import Text.JSON.Generic
class Resource a where
type Representation a :: *
identifier :: Resource a => Identifier a
class Endpoint a where
call :: Resource a => a -> Representation a
data Identifier a = Identifier [String] deriving (Show)
data Binding a = Binding (JSValue -> Either String JSValue)
bind :: (Data a, Resource a, Endpoint a, Data (Representation a)) => Binding a
bind = Binding (\x -> binding $ query x)
where binding query = fmap (\x -> toJSON $ call x) (resultToEither query)
query jsvalue = fromJSON jsvalue
{-- DEMO --}
data HelloWorld = HelloWorld {
name :: String
} deriving (Show, Typeable, Data)
instance Resource HelloWorld where
type Representation HelloWorld = String
identifier = Identifier ["helloworld"]
instance Endpoint HelloWorld where
call r = "Hello " ++ name r
So I had to enable FlexibleContexts to be able to do Data (Representation a), but still it's not working...
I have this error:
src/Orexio/Radix.hs:21:33:
Could not deduce (Data a0) arising from a use of `query'
from the context (Data a,
Resource a,
Endpoint a,
Data (Representation a))
bound by the type signature for
bind :: (Data a, Resource a, Endpoint a,
Data (Representation a)) =>
Binding a
at src/Orexio/Radix.hs:20:9-78
The type variable `a0' is ambiguous
Possible fix: add a type signature that fixes these type variable(s)
Note: there are several potential instances:
instance Data HelloWorld -- Defined at src/Orexio/Radix.hs:29:29
instance Data () -- Defined in `Data.Data'
instance (Data a, Data b) => Data (a, b) -- Defined in `Data.Data'
...plus 42 others
In the second argument of `($)', namely `query x'
In the expression: binding $ query x
In the first argument of `Binding', namely
`(\ x -> binding $ query x)'
Honestly I'm kinda of lost here, I must be missing something but what?
Here is the other extensions I have activated: DeriveDataTypeable, ExistentialQuantification, NoMonomorphismRestriction, TypeFamilies
Thanks in advance!
It looks like Binding a is supposed to be a function that transforms a value of type a to a value of type Representation a or an error message. Since the input and output are JSON-encoded, however, they both have type JSValue; their types don't mention a at all!
data Binding a = Binding (JSValue -> Either String JSValue)
There's no information to indicate what type those JSValues represent.
In the definition of bind, the compiler knows that the return type is Binding a, but there's no link between that type and the types of JSValues. In particular, the compiler cannot deduce that fromJSON should return an a and that toJSON should take a Representation a. To fix it, add explicit type signatures to binding and query.
One detail that sometimes confuses people is how to tell GHC about type variable scopes. This needs the ScopedTypeVariables extension. Add forall a. to the type signature of bind, and add forall. to the other type signatures in the body of bind so that the variable a is properly scoped.
The essence of the type error is this line: "The type variable `a0' is ambiguous".
(Disclaimer: I'm trying to avoid jargon in this answer.)
To understand what's going on here, I suggest floating the binding and query bindings to the top-level. They successfully typecheck if you then comment out the bind binding. GHC infers the following types.
*Orexio.Radix> :i query
query :: Data a => JSValue -> Result a
*Orexio.Radix> :i binding
binding ::
(Data (Representation a), Endpoint a, Resource a) =>
Result a -> Either String JSValue
Your error is essentially caused by the expression \x -> binding (query x) in the definition of bind. Notice that the type variable a appears only in the domain of binding and the range of query. Thus, when you compose them, two things happen.
The type variable is not determined; its "value" remains unknown.
The type variable is not reachable from the type of the
composition. For our informal purposes, that type is JSValue -> Either
String JSValue.
GHC raises the error because the type variable was not determined during the composition (ie 1) and it can never be determined in the future (a consequence of 2).
The general shape of this problem in Haskell is more commonly known as the "show-read problem"; search for "ambiguous" on Chapter 6 of Real World Haskell. (Some might also call it "too much polymorphism".)
As you and Sjoerd have determined (and the RWH chapter explains), you can fix this type error by ascribing a type to the result of query before applying binding. Without knowing your semantics, I assume that you intend this "hidden" type variable a to be the same as the argument to the Binding type constructor. So the following would work.
bind :: forall b.
(Data b, Resource b, Endpoint b, Data (Representation b)) => Binding b
bind = Binding (\x -> binding $ (query x :: Result b))
This ascription eliminates the type variable a by replacing it entirely with Result b. Note that unlike a it is acceptable for b to remain undetermined, since it's reachable in the top-level type; uses of bind may each determine b.
That first solution requires giving bind an explicit signature — which can sometimes be quite onerous. In this case, due to the monomorphism restriction, you probably already need that type signature. However, if bind took an argument (but still exhibited this ambiguous type variable error), you could still rely on type inference by using a solution like the following.
dummy_ResultType :: Binding a -> Result a
dummy_ResultType = error "dummy_ResultType"
bind () = result where
result = Binding (\x -> binding $ (query x `asTypeOf` dummy))
dummy = dummy_ResultType result
(If the use of error worries you, cf the Proxy type.)
Or trade some idiomaticness for directness:
withArgTypeOf :: f x -> g x -> f x
withArgTypeOf x _ = x
bind () = result where
result = Binding (\x -> binding (query x `withArgTypeOf` result))
Now inference works.
*Orexio.Radix> :i bind
bind ::
(Data (Representation a), Data a, Endpoint a, Resource a) =>
() -> Binding a
Rest assured that GHC quickly determines after typechecking that the definition is not actually recursive.
HTH.

Creating polymorphic functions in Haskell

A short search didn't help me to find the answer, so I started to doubt in its existance. The question is simple. I want to create a polymorphic function, something like this:
f :: String -> String
f s = show (length s)
f :: Int -> String
f i = show i
A function defined differently for different data types is meant. Is it possible? If yes, how?
There are two flavors of polymorphism in Haskell:
parameteric polymorphism; and
bounded polymorphism
The first is the most general -- a function is parametrically polymorphic if it behaves uniformly for all types, in at least one of its type parameters.
For example, the function length is polymorphic -- it returns the length of a list, no matter what type is stored in its list.
length :: [a] -> Int
The polymorphism is indicated by a lower case type variable.
Now, if you have custom behavior that you want to have for a certain set of types, then you have bounded polymorphism (also known as "ad hoc"). In Haskell we use type classes for this.
The class declares which function will be available across a set of types:
class FunnyShow a where
funnyshow :: a -> String
and then you can define instances for each type you care about:
instance FunnyShow Int where
funnyshow i = show (i+1)
and maybe:
instance FunnyShow [Char] where
funnyshow s = show (show s)
Here is how you can achieve something similar using type families.
Well if you have same return types then you can achieve the behaviour without using type families and just using type classes alone as suggested by Don.
But it is better to use type families when you want to support more complex adhoc polymorphism, like different return types for each instance.
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE TypeFamilies #-}
class F a where
type Out a :: *
f :: a -> Out a
instance F String where
type Out String = String
f = show . length
instance F Int where
type Out Int = String
f = show
instance F Float where
type Out Float = Float
f = id
In ghci
*Main> f (2::Int)
"2"
*Main> f "Hello"
"5"
*Main> f (2::Float)
2.0

Haskell JSON Rest API — how to generically serialize meta information?

I'm creating a JSON REST API (lots of caps there), and already have Data.Aeson.Generic working nicely. In the following, serializedString would be {"x":10, "y":10}
import qualified Data.Aeson.Generic as A
import Data.Data (Data, Typeable)
data Unit = Unit { x :: Int, y :: Int } deriving (Show, Eq, Data, Typeable)
example = do
let serializedByteString = A.encode (Unit 10 10)
I would like to have my api respond like this for successes:
{unit:{x:10, y:10}}
And this for failures
{error:"Didn't work!"}
I was thinking of making some kind of Response data type, with Response and Error constructors. It's easy to serialize Error, but response could have all kinds of different objects, and rather than send back {data:{...}} I'd like to do {unit:{...}}.
Is there a way to define my Response value constructor so that it works with anything deriving Data? Is there a way to know what the name of the value constructor is when I go to serialize my object? show knows it somehow. Thanks!
Doesn't work for the Error constructor yet (I think), but this works for unit
{-# LANGUAGE OverloadedStrings, DeriveDataTypeable #-}
module Types where
import Data.Data (Data, Typeable, typeOf)
import Data.Aeson (ToJSON(..), (.=), object)
import qualified Data.Aeson.Generic as AG
import qualified Data.Text as T
import Data.Char (toLower)
data Unit = Unit { x :: Int, y :: Int } deriving (Show, Data, Typeable)
data Message a = Message { obj :: a } |
Error String
deriving (Show, Data, Typeable)
instance (Data a, Typeable a) => ToJSON (Message a) where
toJSON m = object [T.pack typeName .= AG.toJSON (obj m)]
where o = obj m
typeName = map toLower $ show $ typeOf o
You have to use Data.Aeson.encode, rather than Data.Aeson.Generic.encode on the top-level message. The key concept I was missing was typeOf from Data.Data.Typeable. It gives you a string representing the data constructor. It will be interesting to try to go the other direction using FromJSON
Edit: to be clear, you can then call Data.Aeson.encode $ Message $ Unit 10 10 and get {unit:{x:10, y:10}}

Difference between method and function in Scala

I read Scala Functions (part of Another tour of Scala). In that post he stated:
Methods and functions are not the same thing
But he didn't explain anything about it. What was he trying to say?
Jim has got this pretty much covered in his blog post, but I'm posting a briefing here for reference.
First, let's see what the Scala Specification tell us. Chapter 3 (types) tell us about Function Types (3.2.9) and Method Types (3.3.1). Chapter 4 (basic declarations) speaks of Value Declaration and Definitions (4.1), Variable Declaration and Definitions (4.2) and Functions Declarations and Definitions (4.6). Chapter 6 (expressions) speaks of Anonymous Functions (6.23) and Method Values (6.7). Curiously, function values is spoken of one time on 3.2.9, and no where else.
A Function Type is (roughly) a type of the form (T1, ..., Tn) => U, which is a shorthand for the trait FunctionN in the standard library. Anonymous Functions and Method Values have function types, and function types can be used as part of value, variable and function declarations and definitions. In fact, it can be part of a method type.
A Method Type is a non-value type. That means there is no value - no object, no instance - with a method type. As mentioned above, a Method Value actually has a Function Type. A method type is a def declaration - everything about a def except its body.
Value Declarations and Definitions and Variable Declarations and Definitions are val and var declarations, including both type and value - which can be, respectively, Function Type and Anonymous Functions or Method Values. Note that, on the JVM, these (method values) are implemented with what Java calls "methods".
A Function Declaration is a def declaration, including type and body. The type part is the Method Type, and the body is an expression or a block. This is also implemented on the JVM with what Java calls "methods".
Finally, an Anonymous Function is an instance of a Function Type (ie, an instance of the trait FunctionN), and a Method Value is the same thing! The distinction is that a Method Value is created from methods, either by postfixing an underscore (m _ is a method value corresponding to the "function declaration" (def) m), or by a process called eta-expansion, which is like an automatic cast from method to function.
That is what the specs say, so let me put this up-front: we do not use that terminology! It leads to too much confusion between so-called "function declaration", which is a part of the program (chapter 4 -- basic declarations) and "anonymous function", which is an expression, and "function type", which is, well a type -- a trait.
The terminology below, and used by experienced Scala programmers, makes one change from the terminology of the specification: instead of saying function declaration, we say method. Or even method declaration. Furthermore, we note that value declarations and variable declarations are also methods for practical purposes.
So, given the above change in terminology, here's a practical explanation of the distinction.
A function is an object that includes one of the FunctionX traits, such as Function0, Function1, Function2, etc. It might be including PartialFunction as well, which actually extends Function1.
Let's see the type signature for one of these traits:
trait Function2[-T1, -T2, +R] extends AnyRef
This trait has one abstract method (it has a few concrete methods as well):
def apply(v1: T1, v2: T2): R
And that tell us all that there is to know about it. A function has an apply method which receives N parameters of types T1, T2, ..., TN, and returns something of type R. It is contra-variant on the parameters it receives, and co-variant on the result.
That variance means that a Function1[Seq[T], String] is a subtype of Function1[List[T], AnyRef]. Being a subtype means it can be used in place of it. One can easily see that if I'm going to call f(List(1, 2, 3)) and expect an AnyRef back, either of the two types above would work.
Now, what is the similarity of a method and a function? Well, if f is a function and m is a method local to the scope, then both can be called like this:
val o1 = f(List(1, 2, 3))
val o2 = m(List(1, 2, 3))
These calls are actually different, because the first one is just a syntactic sugar. Scala expands it to:
val o1 = f.apply(List(1, 2, 3))
Which, of course, is a method call on object f. Functions also have other syntactic sugars to its advantage: function literals (two of them, actually) and (T1, T2) => R type signatures. For example:
val f = (l: List[Int]) => l mkString ""
val g: (AnyVal) => String = {
case i: Int => "Int"
case d: Double => "Double"
case o => "Other"
}
Another similarity between a method and a function is that the former can be easily converted into the latter:
val f = m _
Scala will expand that, assuming m type is (List[Int])AnyRef into (Scala 2.7):
val f = new AnyRef with Function1[List[Int], AnyRef] {
def apply(x$1: List[Int]) = this.m(x$1)
}
On Scala 2.8, it actually uses an AbstractFunction1 class to reduce class sizes.
Notice that one can't convert the other way around -- from a function to a method.
Methods, however, have one big advantage (well, two -- they can be slightly faster): they can receive type parameters. For instance, while f above can necessarily specify the type of List it receives (List[Int] in the example), m can parameterize it:
def m[T](l: List[T]): String = l mkString ""
I think this pretty much covers everything, but I'll be happy to complement this with answers to any questions that may remain.
One big practical difference between a method and a function is what return means. return only ever returns from a method. For example:
scala> val f = () => { return "test" }
<console>:4: error: return outside method definition
val f = () => { return "test" }
^
Returning from a function defined in a method does a non-local return:
scala> def f: String = {
| val g = () => { return "test" }
| g()
| "not this"
| }
f: String
scala> f
res4: String = test
Whereas returning from a local method only returns from that method.
scala> def f2: String = {
| def g(): String = { return "test" }
| g()
| "is this"
| }
f2: String
scala> f2
res5: String = is this
function A function can be invoked with a list of arguments to produce a
result. A function has a parameter list, a body, and a result type.
Functions that are members of a class, trait, or singleton object are
called methods. Functions defined inside other functions are called
local functions. Functions with the result type of Unit are called procedures.
Anonymous functions in source code are called function literals.
At run time, function literals are instantiated into objects called
function values.
Programming in Scala Second Edition.
Martin Odersky - Lex Spoon - Bill Venners
Let Say you have a List
scala> val x =List.range(10,20)
x: List[Int] = List(10, 11, 12, 13, 14, 15, 16, 17, 18, 19)
Define a Method
scala> def m1(i:Int)=i+2
m1: (i: Int)Int
Define a Function
scala> (i:Int)=>i+2
res0: Int => Int = <function1>
scala> x.map((x)=>x+2)
res2: List[Int] = List(12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
Method Accepting Argument
scala> m1(2)
res3: Int = 4
Defining Function with val
scala> val p =(i:Int)=>i+2
p: Int => Int = <function1>
Argument to function is Optional
scala> p(2)
res4: Int = 4
scala> p
res5: Int => Int = <function1>
Argument to Method is Mandatory
scala> m1
<console>:9: error: missing arguments for method m1;
follow this method with `_' if you want to treat it as a partially applied function
Check the following Tutorial that explains passing other differences with examples like other example of diff with Method Vs Function, Using function as Variables, creating function that returned function
Functions don't support parameter defaults. Methods do. Converting from a method to a function loses parameter defaults. (Scala 2.8.1)
There is a nice article here from which most of my descriptions are taken.
Just a short comparison of Functions and Methods regarding my understanding. Hope it helps:
Functions:
They are basically an object. More precisely, functions are objects with an apply method; Therefore, they are a little bit slower than methods because of their overhead. It is similar to static methods in the sense that they are independent of an object to be invoked.
A simple example of a function is just like bellow:
val f1 = (x: Int) => x + x
f1(2) // 4
The line above is nothing except assigning one object to another like object1 = object2. Actually the object2 in our example is an anonymous function and the left side gets the type of an object because of that. Therefore, now f1 is an object(Function). The anonymous function is actually an instance of Function1[Int, Int] that means a function with 1 parameter of type Int and return value of type Int.
Calling f1 without the arguments will give us the signature of the anonymous function (Int => Int = )
Methods:
They are not objects but assigned to an instance of a class,i.e., an object. Exactly the same as method in java or member functions in c++ (as Raffi Khatchadourian pointed out in a comment to this question) and etc.
A simple example of a method is just like bellow:
def m1(x: Int) = x + x
m1(2) // 4
The line above is not a simple value assignment but a definition of a method. When you invoke this method with the value 2 like the second line, the x is substituted with 2 and the result will be calculated and you get 4 as an output. Here you will get an error if just simply write m1 because it is method and need the input value. By using _ you can assign a method to a function like bellow:
val f2 = m1 _ // Int => Int = <function1>
Here is a great post by Rob Norris which explains the difference, here is a TL;DR
Methods in Scala are not values, but functions are. You can construct a function that delegates to a method via η-expansion (triggered by the trailing underscore thingy).
with the following definition:
a method is something defined with def and a value is something you can assign to a val
In a nutshell (extract from the blog):
When we define a method we see that we cannot assign it to a val.
scala> def add1(n: Int): Int = n + 1
add1: (n: Int)Int
scala> val f = add1
<console>:8: error: missing arguments for method add1;
follow this method with `_' if you want to treat it as a partially applied function
val f = add1
Note also the type of add1, which doesn’t look normal; you can’t declare a variable of type (n: Int)Int. Methods are not values.
However, by adding the η-expansion postfix operator (η is pronounced “eta”), we can turn the method into a function value. Note the type of f.
scala> val f = add1 _
f: Int => Int = <function1>
scala> f(3)
res0: Int = 4
The effect of _ is to perform the equivalent of the following: we construct a Function1 instance that delegates to our method.
scala> val g = new Function1[Int, Int] { def apply(n: Int): Int = add1(n) }
g: Int => Int = <function1>
scala> g(3)
res18: Int = 4
Practically, a Scala programmer only needs to know the following three rules to use functions and methods properly:
Methods defined by def and function literals defined by => are functions. It is defined in page 143, Chapter 8 in the book of Programming in Scala, 4th edition.
Function values are objects that can be passed around as any values. Function literals and partially applied functions are function values.
You can leave off the underscore of a partially applied function if a function value is required at a point in the code. For example: someNumber.foreach(println)
After four editions of Programming in Scala, it is still an issue for people to differentiate the two important concepts: function and function value because all editions don't give a clear explanation. The language specification is too complicated. I found the above rules are simple and accurate.
In Scala 2.13, unlike functions, methods can take/return
type parameters (polymorphic methods)
implicit parameters
dependent types
However, these restrictions are lifted in dotty (Scala 3) by Polymorphic function types #4672, for example, dotty version 0.23.0-RC1 enables the following syntax
Type parameters
def fmet[T](x: List[T]) = x.map(e => (e, e))
val ffun = [T] => (x: List[T]) => x.map(e => (e, e))
Implicit parameters (context parameters)
def gmet[T](implicit num: Numeric[T]): T = num.zero
val gfun: [T] => Numeric[T] ?=> T = [T] => (using num: Numeric[T]) => num.zero
Dependent types
class A { class B }
def hmet(a: A): a.B = new a.B
val hfun: (a: A) => a.B = hmet
For more examples, see tests/run/polymorphic-functions.scala
The difference is subtle but substantial and it is related to the type system in use (besides the nomenclature coming from Object Oriented or Functional paradigm).
When we talk about a function, we talk about the type Function: it being a type, an instance of it can be passed around as input or output to other functions (at least in the case of Scala).
When we talk about a method (of a class), we are actually talking about the type represented by the class it is part of: that is, the method is just a component of a larger type, and cannot be passed around by itself. It must be passed around with the instance of the type it is part of (i.e. the instance of the class).
A method belongs to an object (usually the class, trait or object in which you define it), whereas a function is by itself a value, and because in Scala every value is an object, therefore, a function is an object.
For example, given a method and a function below:
def timesTwoMethod(x :Int): Int = x * 2
def timesTwoFunction = (x: Int) => x * 2
The second def is an object of type Int => Int (the syntactic sugar for Function1[Int, Int]).
Scala made functions objects so they could be used as first-class entities. This way you can pass functions to other functions as arguments.
However, Scala can also treat methods as functions via a mechanism called Eta Expansion.
For example, the higher-order function map defined on List, receives another function f: A => B as its only parameter. The next two lines are equivalent:
List(1, 2, 3).map(timesTwoMethod)
List(1, 2, 3).map(timesTwoFunction)
When the compiler sees a def given in a place where a function is needed, it automatically converts the method into an equivalent function.
A method operates on an object but a function doesn't.
Scala and C++ has Fuction but in JAVA, you have to imitate them with static methods.