Currently, if we need to do reduce or forEach on iterable or iterator, would we just have to polyfill it? - ecmascript-6

First of all, would it make sense to have array methods such as reduce or forEach available for iterables and iterators as well? And is it true that, if we want to use them without first materializing a potentially huge array, we just have to polyfill them for now?

Edit
What you propose is being worked on. There is a proposal at stage 2 of the TC39 process for adding a whole bunch of helper methods to the iterator prototype (so they'd be usable by collections) and the proposal includes the two you mention .forEach() and .reduce() along with a dozen others.
I don't yet fully understand how this is supposed to work because the spec talks about iterator helpers, but then shows using .reduce() on an actual Set instance directly, just like you would use it on an array. So, maybe the helpers are used by each class to implement their own method of that name. Since you typically want to reduce a collection, not reduce an iterator, that would make some sense. The iterator is just a tool used in the reduction of the collection, not the collection itself.
They redefine the .reduce() callback to only pass the accumulator and value (no index, no object). FYI, I discovered this by looking at the very end of https://node.green/. So, it is being worked on and since there is a proposed standard, you could polyfill it and you can find sample implementations for tons of proposed new iterator methods here.
Here's a polyfill for the proposed Set.prototype.reduce() and Map.prototype.reduce():
(function() {
if (!Set.prototype.reduce) {
Object.defineProperty(Set.prototype, "reduce", {value: reduce});
}
if (!Map.prototype.reduce) {
Object.defineProperty(Map.prototype, "reduce", {value: reduce});
}
function reduce(fn, initialValue) {
if (typeof fn !== "function") {
throw new TypeError("1st argument to reduce must be function");
}
let noInitial = arguments.length < 2;
let accumulator = initialValue;
for (let [key, value] of this.entries()) {
// if no initial value, get it from the first value
if (noInitial) {
accumulator = value;
noInitial = false;
} else {
accumulator = fn(accumulator, key, value);
}
}
// if there was nothing to iterate and initialValue was not passed
// spec says this should be a TypeError
if (noInitial) {
throw new TypeError("iterable was empty and initialValue not passed");
}
return accumulator;
}
})();
// demo code
let s = new Set([1,2,3,4,5,6]);
let sum = s.reduce((total, val) => {
return total += val;
}, 0);
console.log(`Set Total = ${sum}`);
let m = new Map([['one',1],['two',2],['three',3],['four',4]]);
let sum2 = m.reduce((total, key, val) => {
return total += val;
}, 0);
console.log(`Map Total = ${sum2}`);
I have not quite figured out how a .reduce() method on a base Iterator class automatically makes it so that set.reduce() or map.reduce() will "just work". I'm not sure it does. I'm thinking that each class still has to wire up its own .reduce() method, but it can use the helper implementation on the Iterator object to do so. Perhaps that's why they are called "helpers". They're just common functions that can be used to wire up your own top-level method.
They can probably still be accessed on an iterator directly, but that doesn't seem how you would typically use them.
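To make the "helper on the iterator" idea concrete, here is a minimal sketch that reduces a Set through its iterator rather than through the Set itself. It makes no assumption about whether the runtime ships the proposed helpers: it feature-detects a reduce method on the iterator and otherwise falls back to a plain loop. The name reduceIterator is made up for illustration and is not part of the proposal.

```javascript
// Hypothetical helper: reduce over an iterator, preferring a built-in
// iterator .reduce() when the runtime happens to provide one.
function reduceIterator(iterator, fn, initialValue) {
  if (typeof iterator.reduce === "function") {
    return iterator.reduce((acc, value) => fn(acc, value), initialValue);
  }
  // Fallback: assumes the iterator is itself iterable, as built-in ones are.
  let accumulator = initialValue;
  for (const value of iterator) {
    accumulator = fn(accumulator, value);
  }
  return accumulator;
}

const letters = new Set(["a", "b", "c"]);
// The iterator comes from letters.values(), not from the Set itself.
const joined = reduceIterator(letters.values(), (acc, ch) => acc + ch, "");
console.log(joined); // "abc"
```

Either branch only ever sees one value at a time, so no intermediate array is built.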
Original answer...
You do not really need forEach() because you can just use for/of on any iterable. So, if you really wanted forEach(), you would have to implement it yourself. I wouldn't call it a polyfill because there is no standard you're trying to fill-in for. As such, it would be better to make it a stand-alone function, not pollute the prototype in a non-standard way.
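A stand-alone version along those lines might look like the sketch below. The signature (iterable, fn, thisArg) is my own choice, loosely modelled on Array.prototype.forEach, and the index passed to the callback is only a running counter.

```javascript
// Stand-alone forEach for any iterable; nothing is added to a prototype.
// The index is only a running counter, not a lookup key.
function forEach(iterable, fn, thisArg) {
  let index = 0;
  for (const item of iterable) {
    fn.call(thisArg, item, index++, iterable);
  }
}

// A generator is used here to show that no array is ever created.
function* naturals(limit) {
  for (let n = 1; n <= limit; n++) yield n;
}

const seen = [];
forEach(naturals(3), (n, i) => seen.push(`${i}:${n}`));
console.log(seen); // [ '0:1', '1:2', '2:3' ]
```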
There are certainly some arguments for having a reduce() like function that works with an iterable if you're just trying to iterate and collect some single value from the iteration. Again, since there is no standard implementation for all iterables, you'd have to implement your own function that works with any iterable.
One problem with implementing reduce() for any arbitrary iterable is that Array.prototype.reduce() passes an index to the callback. This somewhat assumes that there is access by that index, like an array has. But some collections that are iterable do not offer access by index. You could still create an index during the iteration and pass it to the callback as just a counter, but it could not necessarily be used the way the index is used when doing someArray.reduce().
Here's an implementation of reduce() that works on any iterable. For reference here's the spec for Array.prototype.reduce() which works off indexed access, not off an iterable which is why it can't be used directly on any iterable, but can be used on any Array-like object.
let s = new Set([1,2,3,4,5,6]);
function reduce(iterable, fn, initialValue) {
if (typeof fn !== "function") {
throw new TypeError("2nd argument to reduce must be function");
}
let initialValuePresent = arguments.length >= 3;
let accumulator = initialValue;
let cntr= 0;
for (let item of iterable) {
// if no initial value, get it from the first value
if (cntr === 0 && !initialValuePresent) {
accumulator = item;
} else {
accumulator = fn(accumulator, item, cntr, iterable);
}
++cntr;
}
// if there was nothing to iterate and initialValue was not passed
// spec says this should be a TypeError
if (cntr === 0 && !initialValuePresent) {
throw new TypeError("iterable was empty and initialValue not passed");
}
return accumulator;
}
let sum = reduce(s, (total, item, cntr, obj) => {
return total += item;
}, 0);
console.log(`Total = ${sum}`);

Related

Why does the optional return() method of the ES6 iterator interface take a value argument? And why does it need to return an IteratorResult object?

This is the ES6 Iterator interface expressed in TypeScript (copied from Exploring ES6 by Axel Rauschmayer):
interface Iterable {
[Symbol.iterator]() : Iterator;
}
interface Iterator {
next() : IteratorResult;
return?(value? : any) : IteratorResult;
}
interface IteratorResult {
value: any;
done: boolean;
}
Question1: Why does the optional return() method of the ES6 iterator interface take one argument (value? : any)?
Question2: And why does it need to return an IteratorResult object?
To make sure we're on the same page, the .return method of an iterator generally behaves as a forced return at the currently-suspended location when called. That means no further code in the function itself will actually run. For a snippet like:
var makeIter = function*(){};
var iter = makeIter();
calling
var result = iter.return(4);
// result is { value: 4, done: true }
Can anyone think of a use case where the optional return method of an ES6 iterator needs to receive a value?
What else would it return? It could be hard-coded to return undefined, but it doesn't seem like there's much of a reason not to allow specifying a return value.
... return an IteratorResult?
What should it return if not that? While not handled by most callers of iterators, technically another yield could run when you call .return, so the generator might not actually have finished executing yet.
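The "another yield could run" case can be reproduced with a yield inside a finally block. The generator below is a contrived illustration, not something from the question:

```javascript
function* guarded() {
  try {
    yield "working";
  } finally {
    // This yield still runs when the caller invokes .return().
    yield "cleaning up";
  }
}

const it = guarded();
const first = it.next();     // { value: "working", done: false }
const forced = it.return(4); // { value: "cleaning up", done: false }
const last = it.next();      // { value: 4, done: true }
console.log(first, forced, last);
```

Because .return() hands back an IteratorResult, the caller can tell from done: false that the generator has not finished yet, which a bare return value could not express.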
Question2: why does it need to return an IteratorResult object?
function* gen(){
yield 123;
return 'ended value';
}
let iter = gen();
console.log(iter.next());//{value: 123, done: false}
console.log(iter.next());//{value: "ended value", done: true}
iter = gen();//restart iter;
console.log([...iter]);//[123] - "ended value" is ignored
Even though it is ignored by the spread operator and the for-of loop, the JavaScript iteration interface allows the last IteratorResult to carry a value ("ended value" in this case). This means the optional return() method of the interface must also return an IteratorResult.
Question1: Why does return() need to take a value as an argument?
The ES6 spec says:
The returned IteratorResult object will typically have a done property whose value is true, and a value property with the value passed as the argument of the return method. However, this requirement is not enforced.
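To see where return() is called in practice, here is a small hand-written iterator that follows the convention quoted above. countingIterable and the log array are names invented for this sketch. A for-of loop calls return() (with no argument) only when it exits early.

```javascript
function countingIterable(limit, log) {
  return {
    [Symbol.iterator]() {
      let i = 0;
      return {
        next() {
          return i < limit
            ? { value: i++, done: false }
            : { value: undefined, done: true };
        },
        // Called by for-of on early exit (break, return, throw).
        return(value) {
          log.push("return called");
          return { value, done: true };
        },
      };
    },
  };
}

const log = [];
for (const n of countingIterable(10, log)) {
  if (n === 2) break;
}
console.log(log); // [ 'return called' ]

const fullLog = [];
for (const n of countingIterable(3, fullLog)) { /* run to completion */ }
console.log(fullLog); // []
```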

Can you bind a this value in a generator function?

Given that you can't use arrow functions when you need to yield in the body, is it possible to set the this value for use inside the body?
I have made myself a database library which extends the "tedious" library that allows me to do something like the following
const self = this;
db.exec(function*(connection) {
let sql = 'SELECT * FROM myTable WHERE id = #id';
let request = connection.request(sql);
request.addParameter('id',db.TYPE.Int, myIdValue);
let count = yield connection.execSql(function*() {
let row = yield;
while(row) {
//process row with something like self.processRow(row);
row=yield;
}
});
if (count > 0) {
request = connection.request('some more sql');
//etc
}
return something;
}).then(something => {
//do some more things if the database access was a success
}).catch(error => {
// deal with any errors.
}) ;
I find I increasingly need to access the this value from the outside, and I am constantly resorting to the trick of assigning it to self at the head of the surrounding function.
Is it possible to set the this value with something like bind inside the function* (at multiple levels down)?
Since I have full access to the iterators that I use to implement db.exec and connection.execSql, I can change them to support it, if that's possible.
Generators use this just as normal functions do.
You have a few options:
use .bind on the generator function expression
pass this as the first/second argument to the generator, named self
make db.exec take a second argument, thisArg, similar to array methods
If a thisArg parameter is provided to forEach(), it will be passed to callback when invoked, for use as its this value. Otherwise, the value undefined will be passed for use as its this value. The this value ultimately observable by callback is determined according to the usual rules for determining the this seen by a function.
I would suggest going with the last solution.
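A minimal sketch of the first and last options follows. The exec function here is a hypothetical, heavily simplified stand-in for db.exec: it drives the generator synchronously and simply feeds each yielded value back in, whereas the real library presumably waits on database results.

```javascript
// Hypothetical stand-in for db.exec: runs a generator to completion
// and feeds each yielded value straight back in.
function exec(genFn, thisArg) {
  const iter = genFn.call(thisArg); // `this` inside genFn is thisArg
  let step = iter.next();
  while (!step.done) {
    step = iter.next(step.value);
  }
  return step.value;
}

const service = {
  prefix: "row-",
  run() {
    // Last option: pass `this` explicitly as thisArg.
    const viaThisArg = exec(function* () {
      const id = yield 7;
      return this.prefix + id;
    }, this);

    // First option: bind the generator function expression.
    const viaBind = exec(function* () {
      const id = yield 8;
      return this.prefix + id;
    }.bind(this));

    return [viaThisArg, viaBind];
  },
};

console.log(service.run()); // [ 'row-7', 'row-8' ]
```

For nested generators (such as the one passed to connection.execSql), the same thisArg would have to be forwarded at each level, or each nested function* bound in turn.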

What is the use in having the valueOf() function?

Why is the valueOf() function present in everything in AS3? I can't think of an instance when this isn't redundant. In terms of getting a value, x and x.valueOf() are completely the same to me (except that one probably takes more CPU cycles). Furthermore even though they may not be the same in terms of setting something, x.valueOf() = y (if even legal) is just completely pointless.
I am confident though that this is here for a reason that I'm just not seeing. What is it? I did try Googling for a minute. Thanks!
As you say, it's completely redundant.
The valueOf method is simply included so that ActionScript 3 complies with the ECMA language specification (obviously there are other requirements to be an ECMA language; I believe toString is another example).
Returns the primitive value of the specified object. If this object does not have a
primitive value, the object itself is returned.
Source: Adobe AS3 Reference http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/Object.html#valueOf()
Edit:
A primitive value can be a Number, int, Boolean, etc. They are just the value. An object can have properties, methods, etc.
Biggest difference, in my opinion though:
primitive2 = primitive1;
In this example, primitive 2 contains a copy of the data in primitive 1.
obj2 = obj1;
In this one, however, obj2 points to the same object as obj1. Modify either obj1 or obj2 and they both reflect the change, since they are references.
In short, valueOf is used when you want to see the primitive representation of an object (if one exists) rather than the object itself.
Here is a clear example of the difference between a Date's value and its valueOf():
Value = Thu Jan 2 13:46:51 GMT-0800 2014 (value is date formatted)
ValueOf = 1388699211000 (valueOf is in Raw epoch)
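The same distinction can be checked in JavaScript, which shares the ECMA heritage. The sketch below uses the epoch value from the example above and prints the date in UTC so the output does not depend on the local time zone:

```javascript
const d = new Date(1388699211000);
console.log(d.toISOString()); // "2014-01-02T21:46:51.000Z"
console.log(d.valueOf());     // 1388699211000

// Arithmetic on dates works on the primitive returned by valueOf().
const oneSecondEarlier = new Date(1388699210000);
console.log(d - oneSecondEarlier); // 1000
```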
valueOf isn't useless. It allows an Object to provide a value for an expression that expects a primitive type. It's available in AS3 as well as JavaScript.
If someone wrote a function that takes an int, you could pass it your object (more precisely, it passes the result of your object's valueOf() function).
The usefulness is tempered by 1) the fact that the Object isn't passed, so it's only an Object in the outermost scope, and 2) the fact that it's a read-only operation, no assignment can be made.
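Since the mechanism is the same in JavaScript, a quick sketch there may help. The Temperature class is invented for illustration; it shows numeric contexts picking valueOf() while string contexts pick toString():

```javascript
class Temperature {
  constructor(celsius) {
    this.celsius = celsius;
  }
  // Used when a number (or "default" primitive) is expected.
  valueOf() {
    return this.celsius;
  }
  // Used when a string is expected.
  toString() {
    return `${this.celsius} C`;
  }
}

const t = new Temperature(20);
console.log(t + 5);     // 25
console.log(t > 10);    // true
console.log(`${t}`);    // "20 C"
console.log(String(t)); // "20 C"
```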
Here're a couple concrete examples off the top of my head:
Example 1: A Counter class that automatically increments its value every time it's read:
class Counter
{
private var _cnt:int = 0;
public function Counter() { }
public function valueOf():int
{
return _cnt++;
}
public function toString():String { return ""+valueOf(); }
}
Usage:
var c:* = new Counter();
trace(c); // 0
trace(c); // 1
trace(2*c+c); // 2*2+3 = 7
trace(c); // 4
Notes:
I added the toString() pass-through, since functions that take String prefer toString over valueOf.
You must type c as * and not Counter, otherwise you'll get a compiler error about implicit coercion of Counter to Number.
Example 2: A (read only) pointer type
Let's say you have an array of ints, and you want to have a reference (aka pointer) to an element in the array. ECMAScript languages don't have pointers, but you can emulate one with valueOf():
class ArrayIntPointer
{
private var arr:Array;
private var idx:int;
public function ArrayIntPointer(arr:Array,
idx:int)
{
this.arr = arr;
this.idx = idx;
}
public function valueOf():int
{
return arr[idx];
}
public function toString():String { return ""+valueOf(); }
}
Usage:
var arr:Array = [1, 2, 3, 4, 5];
var int_ptr:* = new ArrayIntPointer(arr, 2);
// int_ptr is a pointer to the third item in the array and
// can be used in place of an int thanks to valueOf()
trace(int_ptr); // 3
var val:int = 2*int_ptr+1;
trace(val); // 7
// but it's still an object with references, so I
// can change the underlying Array, and now my
// object's primitive (aka, non-Object types) value
// is 50, and it still can be used in place of an int.
arr[2] = 50;
trace(int_ptr); // 50
// you can assign int_ptr, but sadly, this doesn't
// affect the array.
That's pretty slick. It'd be really slick if you could assign the pointer and affect the array, but unfortunately that's not possible, as it assigns the int_ptr variable instead. That's why I call it a read-only pointer.

How to always return a java.util.Vector

If my control has only one value, the following code returns a String; if it has more than one value, the code returns a java.util.Vector.
getComponent("mycontrol").getValue();
I want this code to return a vector even if there is only one value.
I have seen several code snippets that convert my string to an Array, but I want to get back a vector.
There is no way to force a singular value to be returned as a java.util.Vector (or Array for that matter). The only way would be to test whether it is a vector, then build a vector if not. You could place that test into a function and wrap the call in it, for example (this is untested code, so you'll need to verify syntax, etc.):
asVector(getComponent("mycontrol").getValue());
function asVector(obj) {
if (obj.constructor === java.util.Vector) {
return obj;
} else {
var x:java.util.Vector = new java.util.Vector();
x.add(obj);
return x;
}
}

What is Map/Reduce?

I hear a lot about map/reduce, especially in the context of Google's massively parallel compute system. What exactly is it?
From the abstract of Google's MapReduce research publication page:
MapReduce is a programming model and
an associated implementation for
processing and generating large data
sets. Users specify a map function
that processes a key/value pair to
generate a set of intermediate
key/value pairs, and a reduce function
that merges all intermediate values
associated with the same intermediate
key.
The advantage of MapReduce is that the processing can be performed in parallel on multiple processing nodes (multiple servers) so it is a system that can scale very well.
Since it's based on the functional programming model, the map and reduce steps do not have any side effects (the state and results of each subsection of a map process do not depend on another), so the data set being mapped and reduced can be split across multiple processing nodes.
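As a rough, single-process illustration of the model described in the abstract, here is a word count written as explicit map, shuffle and reduce phases. The function names and the tiny data set are made up for this sketch, and nothing here is distributed; it only shows the shape of the data flowing between phases.

```javascript
// Map phase: each (key, value) input pair produces intermediate pairs.
function mapPhase(inputs, mapFn) {
  return inputs.flatMap(([key, value]) => mapFn(key, value));
}

// Shuffle phase: group intermediate values by their key.
function shufflePhase(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Reduce phase: merge all values that share an intermediate key.
function reducePhase(groups, reduceFn) {
  return new Map(
    [...groups].map(([key, values]) => [key, reduceFn(key, values)])
  );
}

const docs = [["doc1", "a b a"], ["doc2", "b c"]];
const emitWords = (_name, text) => text.split(" ").map((w) => [w, 1]);
const sumCounts = (_word, ones) => ones.reduce((sum, n) => sum + n, 0);

const counts = reducePhase(shufflePhase(mapPhase(docs, emitWords)), sumCounts);
console.log([...counts]); // [ [ 'a', 2 ], [ 'b', 2 ], [ 'c', 1 ] ]
```

Because emitWords looks only at its own document and sumCounts only at its own group, each call could in principle run on a different node.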
Joel's Can Your Programming Language Do This? piece discusses how understanding functional programming was essential for Google to come up with MapReduce, which powers its search engine. It's a very good read if you're unfamiliar with functional programming and how it allows scalable code.
See also: Wikipedia: MapReduce
Related question: Please explain mapreduce simply
Map is a function that applies another function to all the items on a list, to produce another list with all the return values on it. (Another way of saying "apply f to x" is "call f, passing it x". So sometimes it sounds nicer to say "apply" instead of "call".)
This is how map is probably written in C# (it's called Select and is in the standard library):
public static IEnumerable<R> Select<T, R>(this IEnumerable<T> list, Func<T, R> func)
{
foreach (T item in list)
yield return func(item);
}
As you're a Java dude, and Joel Spolsky likes to tell GROSSLY UNFAIR LIES about how crappy Java is (actually, he's not lying, it is crappy, but I'm trying to win you over), here's my very rough attempt at a Java version (I have no Java compiler, and I vaguely remember Java version 1.1!):
// represents a function that takes one arg and returns a result
public interface IFunctor
{
    Object invoke(Object arg);
}
public static Object[] map(Object[] list, IFunctor func)
{
    Object[] returnValues = new Object[list.length];
    for (int n = 0; n < list.length; n++)
        returnValues[n] = func.invoke(list[n]);
    return returnValues;
}
I'm sure this can be improved in a million ways. But it's the basic idea.
Reduce is a function that turns all the items on a list into a single value. To do this, it needs to be given another function func that turns two items into a single value. It would work by giving the first two items to func. Then the result of that along with the third item. Then the result of that with the fourth item, and so on until all the items have gone and we're left with one value.
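The pairing order described above can be made visible with a few lines of JavaScript. reduceList is a name invented for this sketch; it records each call to func so the sequence of intermediate results can be inspected.

```javascript
// Combines the first two items, then the result with the third, and so on.
function reduceList(list, func) {
  if (list.length === 0) {
    throw new TypeError("cannot reduce an empty list");
  }
  let result = list[0];
  for (let n = 1; n < list.length; n++) {
    result = func(result, list[n]);
  }
  return result;
}

const steps = [];
const total = reduceList([1, 2, 3, 4], (a, b) => {
  steps.push(`${a}+${b}`);
  return a + b;
});

console.log(steps); // [ '1+2', '3+3', '6+4' ]
console.log(total); // 10
```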
In C# reduce is called Aggregate and is again in the standard library. I'll skip straight to a Java version:
// represents a function that takes two args and returns a result
public interface IBinaryFunctor
{
    Object invoke(Object arg1, Object arg2);
}
public static Object reduce(Object[] list, IBinaryFunctor func)
{
    if (list.length == 0)
        return null; // or throw something?
    if (list.length == 1)
        return list[0]; // just return the only item
    Object returnValue = func.invoke(list[0], list[1]);
    // start at the third item; the first two were combined above
    for (int n = 2; n < list.length; n++)
        returnValue = func.invoke(returnValue, list[n]);
    return returnValue;
}
These Java versions need generics adding to them, but I don't know how to do that in Java. But you should be able to pass them anonymous inner classes to provide the functors:
String[] names = getLotsOfNames();
String commaSeparatedNames = (String) reduce(names,
    new IBinaryFunctor() {
        public Object invoke(Object arg1, Object arg2)
        { return ((String) arg1) + ", " + ((String) arg2); }
    });
Hopefully generics would get rid of the casts. The typesafe equivalent in C# is:
string commaSeparatedNames = names.Aggregate((a, b) => a + ", " + b);
Why is this "cool"? Simple ways of breaking up larger calculations into smaller pieces, so they can be put back together in different ways, are always cool. Google applies this idea to parallelization, because both map and reduce can be shared out over several computers.
But the key requirement is NOT that your language can treat functions as values. Any OO language can do that. The actual requirement for parallelization is that the little func functions you pass to map and reduce must not use or update any state. They must return a value that is dependent only on the argument(s) passed to them. Otherwise, the results will be completely screwed up when you try to run the whole thing in parallel.
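A small experiment illustrates the point. The sketch below simulates "several computers" by splitting an array into chunks, reducing each chunk separately and then combining the partial results. That the function is also associative (as addition is) is an extra assumption of this sketch, and the chunk helper is invented for it.

```javascript
// Split a list into consecutive chunks of at most `size` items.
function chunk(list, size) {
  const parts = [];
  for (let i = 0; i < list.length; i += size) {
    parts.push(list.slice(i, i + size));
  }
  return parts;
}

// Reduce each chunk on its own, then reduce the partial results.
function chunkedReduce(list, size, func) {
  const partials = chunk(list, size).map((part) => part.reduce(func, 0));
  return partials.reduce(func, 0);
}

const numbers = [1, 2, 3, 4, 5, 6, 7, 8];

// Pure function: the result depends only on the arguments.
const add = (a, b) => a + b;
console.log(numbers.reduce(add, 0));         // 36
console.log(chunkedReduce(numbers, 3, add)); // 36

// Impure function: the result also depends on hidden, mutable state.
let calls = 0;
const addAndCount = (a, b) => a + b + calls++;
const sequentialImpure = numbers.reduce(addAndCount, 0);
calls = 0;
const chunkedImpure = chunkedReduce(numbers, 3, addAndCount);
console.log(sequentialImpure === chunkedImpure); // false
```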
After getting most frustrated with either very long waffley or very short vague blog posts I eventually discovered this very good rigorous concise paper.
Then I went ahead and made it more concise by translating it into Scala, where I've provided the simplest case, in which a user just specifies the map and reduce parts of the application. In Hadoop/Spark, strictly speaking, a more complex programming model is employed that requires the user to explicitly specify 4 more functions, outlined here: http://en.wikipedia.org/wiki/MapReduce#Dataflow
import scalaz.syntax.id._
trait MapReduceModel {
type MultiSet[T] = Iterable[T]
// `map` must be a pure function
def mapPhase[K1, K2, V1, V2](map: ((K1, V1)) => MultiSet[(K2, V2)])
(data: MultiSet[(K1, V1)]): MultiSet[(K2, V2)] =
data.flatMap(map)
def shufflePhase[K2, V2](mappedData: MultiSet[(K2, V2)]): Map[K2, MultiSet[V2]] =
mappedData.groupBy(_._1).mapValues(_.map(_._2))
// `reduce` must be a monoid
def reducePhase[K2, V2, V3](reduce: ((K2, MultiSet[V2])) => MultiSet[(K2, V3)])
(shuffledData: Map[K2, MultiSet[V2]]): MultiSet[V3] =
shuffledData.flatMap(reduce).map(_._2)
def mapReduce[K1, K2, V1, V2, V3](data: MultiSet[(K1, V1)])
(map: ((K1, V1)) => MultiSet[(K2, V2)])
(reduce: ((K2, MultiSet[V2])) => MultiSet[(K2, V3)]): MultiSet[V3] =
mapPhase(map)(data) |> shufflePhase |> reducePhase(reduce)
}
// Roughly how MapReduce works in Hadoop and Spark, except `.par` would ensure 1 element gets a process/thread on a cluster.
// Furthermore, the splitting here won't enforce any kind of balance and is quite unnecessary anyway, as one would expect
// the data to already be split on HDFS, i.e. the filename would constitute K1.
// The shuffle phase will also be parallelized, and use the same partition as the map phase.
abstract class ParMapReduce(mapParNum: Int, reduceParNum: Int) extends MapReduceModel {
def split[T](splitNum: Int)(data: MultiSet[T]): Set[MultiSet[T]]
override def mapPhase[K1, K2, V1, V2](map: ((K1, V1)) => MultiSet[(K2, V2)])
(data: MultiSet[(K1, V1)]): MultiSet[(K2, V2)] = {
val groupedByKey = data.groupBy(_._1).map(_._2)
groupedByKey.flatMap(split(mapParNum / groupedByKey.size + 1))
.par.flatMap(_.map(map)).flatten.toList
}
override def reducePhase[K2, V2, V3](reduce: ((K2, MultiSet[V2])) => MultiSet[(K2, V3)])
(shuffledData: Map[K2, MultiSet[V2]]): MultiSet[V3] =
shuffledData.map(g => split(reduceParNum / shuffledData.size + 1)(g._2).map((g._1, _)))
.par.flatMap(_.map(reduce))
.flatten.map(_._2).toList
}
Map is a native JS method that can be applied to an array. It creates a new array as a result of some function mapped to every element in the original array. So if you mapped a function(element) { return element * 2;}, it would return a new array with every element doubled. The original array would go unmodified.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map
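The doubling example from the paragraph above, written out as a short runnable sketch (the variable names are arbitrary):

```javascript
const original = [1, 2, 3];
const doubled = original.map(function (element) {
  return element * 2;
});

console.log(doubled);  // [ 2, 4, 6 ]
console.log(original); // [ 1, 2, 3 ] (unchanged)
```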
Reduce is a native JS method that can also be applied to an array. It applies a function to an array and has an initial output value called an accumulator. It loops through each element in the array, applies a function, and reduces them to a single value (which begins as the accumulator). It is useful because you can have any output you want, you just have to start with that type of accumulator. So if I wanted to reduce something into an object, I would start with an accumulator {}.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/Reduce?v=a
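As a sketch of reducing into an object, the example below starts with an empty {} accumulator and groups words by their first letter. The word list is made up for illustration.

```javascript
const words = ["apple", "avocado", "banana", "blueberry", "cherry"];

// The accumulator starts as {} and ends up holding one array per letter.
const byLetter = words.reduce((acc, word) => {
  const letter = word[0];
  if (!acc[letter]) {
    acc[letter] = [];
  }
  acc[letter].push(word);
  return acc; // the returned value becomes the next accumulator
}, {});

console.log(byLetter);
// { a: [ 'apple', 'avocado' ], b: [ 'banana', 'blueberry' ], c: [ 'cherry' ] }
```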