Scala: val foo = (arg: Type) => {...} vs. def(arg:Type) = {...} - function

Related to this thread
I am still unclear on the distinction between these 2 definitions:
val foo = (arg: Type) => {...}
def foo(arg: Type) = {...}
As I understand it:
1) the val version is bound once, at compile time: a single Function1 instance is created, which can be passed as a method parameter
2) the def version is bound anew on each call: a new method instance is created per call
If the above is true, then why would one ever choose the def version in cases where the operation(s) to perform are not dependent on runtime state?
For example, in a servlet environment you might want to get the IP address of the connecting client; in this case you need to use a def since, of course, there is no connected client at compile time.
On the other hand, you often know, at compile time, the operations to perform, and can go with an immutable val foo = (i: Type) => {...}
As a rule of thumb then, should one only use defs when there is a runtime state dependency?
Thanks for clarifying

I'm not entirely clear on what you mean by runtime state dependency. Both vals and defs can close over their lexical scope and are hence unlimited in this way. So what are the differences between methods (defs) and functions (as vals) in Scala (which has been asked and answered before)?
You can parameterize a def
For example:
object List {
  def empty[A]: List[A] = Nil     // type parameter allowed here
  val Empty: List[Nothing] = Nil  // cannot take a type parameter
}
I can then call:
List.empty[Int]
But I would have to use:
List.Empty: List[Int]
But of course there are other reasons as well. Such as:
A def is a method at the JVM level
If I were to use the piece of code:
trades filter isEuropean
I could choose a declaration of isEuropean as either:
val isEuropean = (_ : Trade).country.region == Europe
Or
def isEuropean(t: Trade) = t.country.region == Europe
The latter avoids creating an object (for the function instance) at the point of declaration, but not at the point of use: there Scala creates a function instance for the method anyway (the eta-expansion of trades filter isEuropean _). It would have been clearer if I had used the _ syntax explicitly.
However, in the following piece of code:
val b = isEuropean(t)
...if isEuropean is declared a def, no such object is being created and hence the code may be more performant (if used in very tight loops where every last nanosecond is of critical value)
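To make the trade-off concrete, here is a small hedged sketch; the Trade, Country, Region and Europe definitions below are illustrative stand-ins, not part of the original example:
case class Region(name: String)
val Europe = Region("Europe")
case class Country(region: Region)
case class Trade(country: Country)

val isEuropeanVal = (t: Trade) => t.country.region == Europe      // one Function1 instance allocated here
def isEuropeanDef(t: Trade): Boolean = t.country.region == Europe // a plain JVM method, no object allocated

val trades: List[Trade] = List(Trade(Country(Europe)))

trades filter isEuropeanVal   // reuses the single function instance created at the val
trades filter isEuropeanDef   // eta-expansion: a function instance is created at this call site
isEuropeanDef(trades.head)    // direct call: no function object is created at all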

Overloading on reference to method of a particular instance in Kotlin

It's clear how to reference a method of a particular instance: Reference to method of a particular instance in Kotlin
e.g.
val f = a::getItem
However what if getItem is overloaded? I cannot seem to find any material on that.
Let's assume the getItem has the following overloaded functions:
getItem (String) -> Item
getItem (String, Metrics) -> Item
How do I select a particular overload when creating a bound-instance callable reference?
The context will determine which overload is chosen. In the case of
val f = a::getItem
The context does not say anything about what type a::getItem should be, so if getItem were overloaded, both overloads would be applicable, and there would be a compile-time error telling you exactly that. Something like:
Overload resolution ambiguity. All these functions match.
public fun getItem(name: String): Item defined in ...
public fun getItem(name: String, metrics: Metrics): Item defined in ...
If you instead give it some information about the type of f:
val f: (String) -> Item = a::getItem
Then it will pick the correct overload.

Python objects in dealloc in cython

In the docs it is written that "Any C data that you explicitly allocated (e.g. via malloc) in your __cinit__() method should be freed in your __dealloc__() method."
This is not my case. I have the following extension class:
cdef class SomeClass:
    cdef dict data
    cdef void * u_data

    def __init__(self, data_len):
        self.data = {'columns': []}
        if data_len > 0:
            self.data.update({'data': deque(maxlen=data_len)})
        else:
            self.data.update({'data': []})
        self.u_data = <void *>self.data

    @property
    def data(self):
        return self.data

    @data.setter
    def data(self, new_val: dict):
        self.data = new_val
Some C function has access to this class and appends some data to the SomeClass().data dict. What should I write in __dealloc__ when I want to delete an instance of SomeClass()?
Maybe something like:
def __dealloc__(self):
    self.data = None
    free(self.u_data)
Or there is no need to dealloc anything at all?
No, you don't need to, and no, you shouldn't. From the documentation:
You need to be careful what you do in a __dealloc__() method. By the time your __dealloc__() method is called, the object may already have been partially destroyed and may not be in a valid state as far as Python is concerned, so you should avoid invoking any Python operations which might touch the object. In particular, don’t call any other methods of the object or do anything which might cause the object to be resurrected. It’s best if you stick to just deallocating C data.
You don’t need to worry about deallocating Python attributes of your object, because that will be done for you by Cython after your __dealloc__() method returns.
You can confirm this by inspecting the C code (you need to look at the full code, not just the annotated HTML). There's an autogenerated function __pyx_tp_dealloc_9someclass_SomeClass (the name may vary slightly depending on what you called your module) that does a range of things, including:
__pyx_pw_9someclass_9SomeClass_3__dealloc__(o);
/* some other code */
Py_CLEAR(p->data);
where the function __pyx_pw_9someclass_9SomeClass_3__dealloc__ is (a wrapper for) your user-defined __dealloc__. Py_CLEAR will ensure that data is appropriately reference-counted then set to NULL.
It's a little hard to follow because it all goes through several layers of wrappers, but you can confirm that it does what the documentation says.

Advanced Parameterization Manual in Chisel

This is inside the chisel library
object Module {
  // returns a new Module of type T, initialized with a Parameters instance if _p != None
  def apply[T <: Module](c: => T)(implicit _p: Option[Parameters] = None): T
}
I don't understand the = sign in the parameters. What does it represent?
The = in (implicit _p: Option[Parameters] = None) is assigning a default value of None to the parameter _p. That means that, unless otherwise specified, no Parameters instance is assigned to _p.
Just in case you are asking about the => in (c: => T), the => means that the first parameter c is a reference to a function that returns an instance of T, where T is a subclass of Module.
There's a bunch of idiomatic features of Scala being employed here: function currying, implicit parameters, and functions as first-class citizens of the language. It's worth taking a bit of time to learn the syntax of these things. Check out Chisel's generator-bootcamp tutorial, particularly sections 3.2 and 3.3, for some of the ways Chisel takes advantage of Scala's syntax.
This example has two = signs. The first corresponds to by-name parameters: https://docs.scala-lang.org/tour/by-name-parameters.html.
The former is important because Modules in Chisel must be wrapped in Module(...) when they are constructed. We generally accomplish this using call-by-name:
class MyModule extends Module {
  ...
}

// This works!
def func(mod: => MyModule) = {
  val instance = Module(mod) // The module is constructed inside Module(...)
}
func(new MyModule)

// This doesn't work!
def func(mod: MyModule) = {
  val instance = Module(mod)
}
func(new MyModule) // The module is constructed too early, here!
The second is a Default parameter: https://docs.scala-lang.org/tour/default-parameter-values.html. It's mainly a convenience thing:
def func(x: Int = 3) = { println(x) }
func(5) // prints 5
func() // prints 3
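If it helps, here is a minimal plain-Scala sketch (no Chisel required, names made up for illustration) of the difference between a by-value and a by-name parameter:
def byValue(x: Int): Int = x + x    // the argument is evaluated once, before the call
def byName(x: => Int): Int = x + x  // the argument is evaluated each time x is used

def loudOne(): Int = { println("evaluating"); 1 }

byValue(loudOne())  // prints "evaluating" once, returns 2
byName(loudOne())   // prints "evaluating" twice, returns 2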

Difference between method and function in Scala

I read Scala Functions (part of Another tour of Scala). In that post he stated:
Methods and functions are not the same thing
But he didn't explain anything about it. What was he trying to say?
Jim has got this pretty much covered in his blog post, but I'm posting a briefing here for reference.
First, let's see what the Scala Specification tells us. Chapter 3 (types) tells us about Function Types (3.2.9) and Method Types (3.3.1). Chapter 4 (basic declarations) speaks of Value Declarations and Definitions (4.1), Variable Declarations and Definitions (4.2), and Function Declarations and Definitions (4.6). Chapter 6 (expressions) speaks of Anonymous Functions (6.23) and Method Values (6.7). Curiously, function values are spoken of once, in 3.2.9, and nowhere else.
A Function Type is (roughly) a type of the form (T1, ..., Tn) => U, which is a shorthand for the trait FunctionN in the standard library. Anonymous Functions and Method Values have function types, and function types can be used as part of value, variable and function declarations and definitions. In fact, it can be part of a method type.
A Method Type is a non-value type. That means there is no value - no object, no instance - with a method type. As mentioned above, a Method Value actually has a Function Type. A method type is a def declaration - everything about a def except its body.
Value Declarations and Definitions and Variable Declarations and Definitions are val and var declarations, including both type and value - which can be, respectively, Function Type and Anonymous Functions or Method Values. Note that, on the JVM, these (method values) are implemented with what Java calls "methods".
A Function Declaration is a def declaration, including type and body. The type part is the Method Type, and the body is an expression or a block. This is also implemented on the JVM with what Java calls "methods".
Finally, an Anonymous Function is an instance of a Function Type (ie, an instance of the trait FunctionN), and a Method Value is the same thing! The distinction is that a Method Value is created from methods, either by postfixing an underscore (m _ is a method value corresponding to the "function declaration" (def) m), or by a process called eta-expansion, which is like an automatic cast from method to function.
That is what the specs say, so let me put this up-front: we do not use that terminology! It leads to too much confusion between so-called "function declaration", which is a part of the program (chapter 4 -- basic declarations), "anonymous function", which is an expression, and "function type", which is, well, a type -- a trait.
The terminology below, and used by experienced Scala programmers, makes one change from the terminology of the specification: instead of saying function declaration, we say method. Or even method declaration. Furthermore, we note that value declarations and variable declarations are also methods for practical purposes.
So, given the above change in terminology, here's a practical explanation of the distinction.
A function is an object that includes one of the FunctionX traits, such as Function0, Function1, Function2, etc. It might include PartialFunction as well, which actually extends Function1.
Let's see the type signature for one of these traits:
trait Function2[-T1, -T2, +R] extends AnyRef
This trait has one abstract method (it has a few concrete methods as well):
def apply(v1: T1, v2: T2): R
And that tells us all there is to know about it. A function has an apply method which receives N parameters of types T1, T2, ..., TN, and returns something of type R. It is contra-variant on the parameters it receives, and co-variant on the result.
That variance means that a Function1[Seq[T], String] is a subtype of Function1[List[T], AnyRef]. Being a subtype means it can be used in place of it. One can easily see that if I'm going to call f(List(1, 2, 3)) and expect an AnyRef back, either of the two types above would work.
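A quick sketch of that variance claim (my own illustration, not part of the original answer):
val f: Function1[Seq[Int], String]  = (s: Seq[Int]) => s.mkString(",")
val g: Function1[List[Int], AnyRef] = f   // compiles: contravariant in the argument, covariant in the result
g(List(1, 2, 3))                          // "1,2,3", typed as AnyRef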
Now, what is the similarity of a method and a function? Well, if f is a function and m is a method local to the scope, then both can be called like this:
val o1 = f(List(1, 2, 3))
val o2 = m(List(1, 2, 3))
These calls are actually different, because the first one is just syntactic sugar. Scala expands it to:
val o1 = f.apply(List(1, 2, 3))
Which, of course, is a method call on object f. Functions also have other syntactic sugars to their advantage: function literals (two of them, actually) and (T1, T2) => R type signatures. For example:
val f = (l: List[Int]) => l mkString ""
val g: (AnyVal) => String = {
  case i: Int    => "Int"
  case d: Double => "Double"
  case o         => "Other"
}
Another similarity between a method and a function is that the former can be easily converted into the latter:
val f = m _
Scala will expand that, assuming m's type is (List[Int])AnyRef, into (Scala 2.7):
val f = new AnyRef with Function1[List[Int], AnyRef] {
  def apply(x$1: List[Int]) = this.m(x$1)
}
On Scala 2.8, it actually uses an AbstractFunction1 class to reduce class sizes.
Notice that one can't convert the other way around -- from a function to a method.
Methods, however, have one big advantage (well, two -- they can be slightly faster): they can receive type parameters. For instance, while f above must necessarily specify the type of List it receives (List[Int] in the example), m can parameterize it:
def m[T](l: List[T]): String = l mkString ""
I think this pretty much covers everything, but I'll be happy to complement this with answers to any questions that may remain.
One big practical difference between a method and a function is what return means. return only ever returns from a method. For example:
scala> val f = () => { return "test" }
<console>:4: error: return outside method definition
val f = () => { return "test" }
^
Returning from a function defined in a method does a non-local return:
scala> def f: String = {
| val g = () => { return "test" }
| g()
| "not this"
| }
f: String
scala> f
res4: String = test
Whereas returning from a local method only returns from that method.
scala> def f2: String = {
| def g(): String = { return "test" }
| g()
| "is this"
| }
f2: String
scala> f2
res5: String = is this
function: A function can be invoked with a list of arguments to produce a result. A function has a parameter list, a body, and a result type. Functions that are members of a class, trait, or singleton object are called methods. Functions defined inside other functions are called local functions. Functions with the result type of Unit are called procedures. Anonymous functions in source code are called function literals. At run time, function literals are instantiated into objects called function values.
-- Programming in Scala, Second Edition. Martin Odersky, Lex Spoon, Bill Venners
Let's say you have a List
scala> val x =List.range(10,20)
x: List[Int] = List(10, 11, 12, 13, 14, 15, 16, 17, 18, 19)
Define a Method
scala> def m1(i:Int)=i+2
m1: (i: Int)Int
Define a Function
scala> (i:Int)=>i+2
res0: Int => Int = <function1>
scala> x.map((x)=>x+2)
res2: List[Int] = List(12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
Method Accepting Argument
scala> m1(2)
res3: Int = 4
Defining Function with val
scala> val p =(i:Int)=>i+2
p: Int => Int = <function1>
Referring to a Function without Arguments is Allowed
scala> p(2)
res4: Int = 4
scala> p
res5: Int => Int = <function1>
Referring to a Method without Arguments is an Error
scala> m1
<console>:9: error: missing arguments for method m1;
follow this method with `_' if you want to treat it as a partially applied function
Check the following tutorial, which explains these and other differences with examples, such as using functions as variables and creating a function that returns a function.
Functions don't support parameter defaults. Methods do. Converting from a method to a function loses parameter defaults. (Scala 2.8.1)
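A minimal sketch of what gets lost (the greet names below are illustrative):
def greet(name: String = "world"): String = "hello, " + name
greet()                // "hello, world" -- the default applies

val greetFn = greet _  // eta-expansion produces a String => String
greetFn("scala")       // "hello, scala"
// greetFn()           // does not compile: the function value knows nothing about the default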
There is a nice article here from which most of my descriptions are taken.
Just a short comparison of functions and methods, based on my understanding. Hope it helps:
Functions:
They are basically objects. More precisely, functions are objects with an apply method; therefore, they are a little bit slower than methods because of that overhead. They are similar to static methods in the sense that they are independent of an object to be invoked on.
A simple example of a function is just like below:
val f1 = (x: Int) => x + x
f1(2) // 4
The line above is nothing except assigning one object to another, like object1 = object2. Actually the object2 in our example is an anonymous function, and the left side gets the type of an object because of that. Therefore, now f1 is an object (a Function). The anonymous function is actually an instance of Function1[Int, Int], which means a function with one parameter of type Int and a return value of type Int.
Referring to f1 without arguments will give us the signature of the anonymous function (Int => Int = <function1>).
Methods:
They are not objects but are attached to an instance of a class, i.e., an object. They are exactly the same as methods in Java or member functions in C++ (as Raffi Khatchadourian pointed out in a comment to this question), etc.
A simple example of a method is just like below:
def m1(x: Int) = x + x
m1(2) // 4
The line above is not a simple value assignment but the definition of a method. When you invoke this method with the value 2, as in the second line, x is substituted with 2, the result is calculated, and you get 4 as output. Here you will get an error if you simply write m1, because it is a method and needs an input value. By using _ you can assign a method to a function value, like below:
val f2 = m1 _ // Int => Int = <function1>
Here is a great post by Rob Norris which explains the difference, here is a TL;DR
Methods in Scala are not values, but functions are. You can construct a function that delegates to a method via η-expansion (triggered by the trailing underscore thingy).
with the following definition:
a method is something defined with def and a value is something you can assign to a val
In a nutshell (extract from the blog):
When we define a method we see that we cannot assign it to a val.
scala> def add1(n: Int): Int = n + 1
add1: (n: Int)Int
scala> val f = add1
<console>:8: error: missing arguments for method add1;
follow this method with `_' if you want to treat it as a partially applied function
val f = add1
Note also the type of add1, which doesn’t look normal; you can’t declare a variable of type (n: Int)Int. Methods are not values.
However, by adding the η-expansion postfix operator (η is pronounced “eta”), we can turn the method into a function value. Note the type of f.
scala> val f = add1 _
f: Int => Int = <function1>
scala> f(3)
res0: Int = 4
The effect of _ is to perform the equivalent of the following: we construct a Function1 instance that delegates to our method.
scala> val g = new Function1[Int, Int] { def apply(n: Int): Int = add1(n) }
g: Int => Int = <function1>
scala> g(3)
res18: Int = 4
Practically, a Scala programmer only needs to know the following three rules to use functions and methods properly:
Methods defined by def and function literals defined by => are both functions. This is stated on page 143, Chapter 8 of Programming in Scala, 4th edition.
Function values are objects that can be passed around as any values. Function literals and partially applied functions are function values.
You can leave off the underscore of a partially applied function if a function value is required at a point in the code. For example: someNumber.foreach(println)
After four editions of Programming in Scala, it is still an issue for people to differentiate the two important concepts, function and function value, because none of the editions gives a clear explanation. The language specification is too complicated. I found the above rules simple and accurate.
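For example, rule 3 in action (a small sketch of my own, not taken from the book):
def double(i: Int): Int = i * 2

List(1, 2, 3).map(double)    // a function value is required here, so the underscore can be left off
List(1, 2, 3).map(double _)  // the explicit partially applied form gives the same result: List(2, 4, 6)
val d = double _             // elsewhere, Scala 2 needs the underscore (or an expected function type)
                             // to turn the method into a function value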
In Scala 2.13, unlike functions, methods can take/return
type parameters (polymorphic methods)
implicit parameters
dependent types
However, these restrictions are lifted in Dotty (Scala 3) by Polymorphic function types #4672; for example, Dotty version 0.23.0-RC1 enables the following syntax:
Type parameters
def fmet[T](x: List[T]) = x.map(e => (e, e))
val ffun = [T] => (x: List[T]) => x.map(e => (e, e))
Implicit parameters (context parameters)
def gmet[T](implicit num: Numeric[T]): T = num.zero
val gfun: [T] => Numeric[T] ?=> T = [T] => (using num: Numeric[T]) => num.zero
Dependent types
class A { class B }
def hmet(a: A): a.B = new a.B
val hfun: (a: A) => a.B = hmet
For more examples, see tests/run/polymorphic-functions.scala
The difference is subtle but substantial, and it is related to the type system in use (besides the nomenclature coming from the object-oriented or functional paradigm).
When we talk about a function, we talk about the type Function: it being a type, an instance of it can be passed around as input or output to other functions (at least in the case of Scala).
When we talk about a method (of a class), we are actually talking about the type represented by the class it is part of: that is, the method is just a component of a larger type, and cannot be passed around by itself. It must be passed around with the instance of the type it is part of (i.e. the instance of the class).
A method belongs to an object (usually the class, trait or object in which you define it), whereas a function is by itself a value, and because in Scala every value is an object, therefore, a function is an object.
For example, given a method and a function below:
def timesTwoMethod(x :Int): Int = x * 2
def timesTwoFunction = (x: Int) => x * 2
The second def evaluates to an object of type Int => Int (syntactic sugar for Function1[Int, Int]).
Scala made functions objects so they could be used as first-class entities. This way you can pass functions to other functions as arguments.
However, Scala can also treat methods as functions via a mechanism called Eta Expansion.
For example, the higher-order function map defined on List, receives another function f: A => B as its only parameter. The next two lines are equivalent:
List(1, 2, 3).map(timesTwoMethod)
List(1, 2, 3).map(timesTwoFunction)
When the compiler sees a def given in a place where a function is needed, it automatically converts the method into an equivalent function.
A method operates on an object but a function doesn't.
Scala and C++ have functions, but in Java you have to imitate them with static methods.

What is Map/Reduce?

I hear a lot about map/reduce, especially in the context of Google's massively parallel compute system. What exactly is it?
From the abstract of Google's MapReduce research publication page:
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.
The advantage of MapReduce is that the processing can be performed in parallel on multiple processing nodes (multiple servers) so it is a system that can scale very well.
Since it's based on the functional programming model, the map and reduce steps each have no side effects (the state and results from each subsection of a map process do not depend on the others), so the data set being mapped and reduced can be separated over multiple processing nodes.
Joel's Can Your Programming Language Do This? piece discusses how understanding functional programming was essential in Google to come up with MapReduce, which powers its search engine. It's a very good read if you're unfamiliar with functional programming and how it allows scalable code.
See also: Wikipedia: MapReduce
Related question: Please explain mapreduce simply
Map is a function that applies another function to all the items on a list, to produce another list with all the return values on it. (Another way of saying "apply f to x" is "call f, passing it x". So sometimes it sounds nicer to say "apply" instead of "call".)
This is how map is probably written in C# (it's called Select and is in the standard library):
public static IEnumerable<R> Select<T, R>(this IEnumerable<T> list, Func<T, R> func)
{
    foreach (T item in list)
        yield return func(item);
}
As you're a Java dude, and Joel Spolsky likes to tell GROSSLY UNFAIR LIES about how crappy Java is (actually, he's not lying, it is crappy, but I'm trying to win you over), here's my very rough attempt at a Java version (I have no Java compiler, and I vaguely remember Java version 1.1!):
// represents a function that takes one arg and returns a result
public interface IFunctor
{
    Object invoke(Object arg);
}

public static Object[] map(Object[] list, IFunctor func)
{
    Object[] returnValues = new Object[list.length];
    for (int n = 0; n < list.length; n++)
        returnValues[n] = func.invoke(list[n]);
    return returnValues;
}
I'm sure this can be improved in a million ways. But it's the basic idea.
Reduce is a function that turns all the items on a list into a single value. To do this, it needs to be given another function func that turns two items into a single value. It would work by giving the first two items to func. Then the result of that along with the third item. Then the result of that with the fourth item, and so on until all the items have gone and we're left with one value.
In C# reduce is called Aggregate and is again in the standard library. I'll skip straight to a Java version:
// represents a function that takes two args and returns a result
public interface IBinaryFunctor
{
    Object invoke(Object arg1, Object arg2);
}

public static Object reduce(Object[] list, IBinaryFunctor func)
{
    if (list.length == 0)
        return null; // or throw something?

    if (list.length == 1)
        return list[0]; // just return the only item

    Object returnValue = func.invoke(list[0], list[1]);

    for (int n = 2; n < list.length; n++) // start at 2: items 0 and 1 are already combined
        returnValue = func.invoke(returnValue, list[n]);

    return returnValue;
}
These Java versions need generics adding to them, but I don't know how to do that in Java. But you should be able to pass them anonymous inner classes to provide the functors:
String[] names = getLotsOfNames();

String commaSeparatedNames = (String) reduce(names,
    new IBinaryFunctor() {
        public Object invoke(Object arg1, Object arg2)
            { return ((String) arg1) + ", " + ((String) arg2); }
    });
Hopefully generics would get rid of the casts. The typesafe equivalent in C# is:
string commaSeparatedNames = names.Aggregate((a, b) => a + ", " + b);
Why is this "cool"? Simple ways of breaking up larger calculations into smaller pieces, so they can be put back together in different ways, are always cool. The way Google applies this idea is to parallelization, because both map and reduce can be shared out over several computers.
But the key requirement is NOT that your language can treat functions as values. Any OO language can do that. The actual requirement for parallelization is that the little func functions you pass to map and reduce must not use or update any state. They must return a value that is dependent only on the argument(s) passed to them. Otherwise, the results will be completely screwed up when you try to run the whole thing in parallel.
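For instance, here is a word count written with side-effect-free map and reduce steps, sketched in Scala for brevity; every intermediate result depends only on its inputs, so the pieces could safely be shared out across machines:
val lines = List("a b a", "b c")

// map step: depends only on its input line
val mapped = lines.flatMap(_.split("\\s+").map(word => (word, 1)))

// reduce step: combines the pairs for each word without touching any shared state
val reduced = mapped.groupBy(_._1).map { case (word, pairs) => (word, pairs.map(_._2).sum) }
// reduced: Map(a -> 2, b -> 2, c -> 1)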
After getting frustrated with either very long and waffly or very short and vague blog posts, I eventually discovered this very good, rigorous, concise paper.
Then I went ahead and made it more concise by translating it into Scala, where I've provided the simplest case, in which a user simply specifies the map and reduce parts of the application. In Hadoop/Spark, strictly speaking, a more complex programming model is employed that requires the user to explicitly specify 4 more functions, outlined here: http://en.wikipedia.org/wiki/MapReduce#Dataflow
import scalaz.syntax.id._

trait MapReduceModel {
  type MultiSet[T] = Iterable[T]

  // `map` must be a pure function
  def mapPhase[K1, K2, V1, V2](map: ((K1, V1)) => MultiSet[(K2, V2)])
                              (data: MultiSet[(K1, V1)]): MultiSet[(K2, V2)] =
    data.flatMap(map)

  def shufflePhase[K2, V2](mappedData: MultiSet[(K2, V2)]): Map[K2, MultiSet[V2]] =
    mappedData.groupBy(_._1).mapValues(_.map(_._2))

  // `reduce` must be a monoid
  def reducePhase[K2, V2, V3](reduce: ((K2, MultiSet[V2])) => MultiSet[(K2, V3)])
                             (shuffledData: Map[K2, MultiSet[V2]]): MultiSet[V3] =
    shuffledData.flatMap(reduce).map(_._2)

  def mapReduce[K1, K2, V1, V2, V3](data: MultiSet[(K1, V1)])
                                   (map: ((K1, V1)) => MultiSet[(K2, V2)])
                                   (reduce: ((K2, MultiSet[V2])) => MultiSet[(K2, V3)]): MultiSet[V3] =
    mapPhase(map)(data) |> shufflePhase |> reducePhase(reduce)
}
// Kinda how MapReduce works in Hadoop and Spark except `.par` would ensure 1 element gets a process/thread on a cluster.
// Furthermore, the splitting here won't enforce any kind of balance and is quite unnecessary anyway, as one would expect
// it to already be split on HDFS - i.e. the filename would constitute K1.
// The shuffle phase will also be parallelized, and use the same partition as the map phase.
abstract class ParMapReduce(mapParNum: Int, reduceParNum: Int) extends MapReduceModel {
  def split[T](splitNum: Int)(data: MultiSet[T]): Set[MultiSet[T]]

  override def mapPhase[K1, K2, V1, V2](map: ((K1, V1)) => MultiSet[(K2, V2)])
                                       (data: MultiSet[(K1, V1)]): MultiSet[(K2, V2)] = {
    val groupedByKey = data.groupBy(_._1).map(_._2)
    groupedByKey.flatMap(split(mapParNum / groupedByKey.size + 1))
      .par.flatMap(_.map(map)).flatten.toList
  }

  override def reducePhase[K2, V2, V3](reduce: ((K2, MultiSet[V2])) => MultiSet[(K2, V3)])
                                      (shuffledData: Map[K2, MultiSet[V2]]): MultiSet[V3] =
    shuffledData.map(g => split(reduceParNum / shuffledData.size + 1)(g._2).map((g._1, _)))
      .par.flatMap(_.map(reduce))
      .flatten.map(_._2).toList
}
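A hypothetical usage sketch for the model above, assuming the trait compiles as given (with scalaz on the classpath): a word count where K1 is a file name and V1 is that file's contents. The WordCount name and the sample data are mine.
object WordCount extends MapReduceModel

val data: Iterable[(String, String)] = List(("file1", "a b a"), ("file2", "b"))

val counts = WordCount.mapReduce(data) {
  case (_, contents) => contents.split("\\s+").toList.map(word => (word, 1))
} {
  case (word, ones) => List((word, (word, ones.sum)))
}
// counts: Iterable[(String, Int)], e.g. List((a, 2), (b, 2))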
Map is a native JS method that can be applied to an array. It creates a new array as a result of some function mapped to every element in the original array. So if you mapped a function(element) { return element * 2;}, it would return a new array with every element doubled. The original array would go unmodified.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map
Reduce is a native JS method that can also be applied to an array. It takes a function and an initial output value called an accumulator. It loops through each element in the array, applies the function, and reduces the array to a single value (which begins as the accumulator). It is useful because you can have any output you want, you just have to start with that type of accumulator. So if I wanted to reduce something into an object, I would start with an accumulator of {}.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/Reduce?v=a