What are some uses of closures for OOP? - language-agnostic

PHP and .Net have closures; I have been wondering what are some examples of using closures in OOP and design patterns, and what advantages they have over pure OOP programming.
As a clarification, this is not an OOP vs. functional programming question, but a question about how best to use closures in an OOP design. How do closures fit in, say, factories or the observer pattern? What are some tricks you can pull that clarify the design and result in looser coupling, for example?

Closures are useful for event-handling. This example is a bit contrived, but I think it conveys the idea:
class FileOpener
{
    public FileOpener(OpenFileTrigger trigger)
    {
        trigger.FileOpenTriggered += (sender, args) => { this.Open(args.PathToFile); };
    }
    public void Open(string pathToFile)
    {
        //…
    }
}
My file opener can either open a file when instance.Open(pathToFile) is called directly, or it can be triggered by some event. If I didn't have anonymous functions + closures, I'd have to write a method that had no other purpose than to respond to this event.
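Since the question also asks about the observer pattern, here is the same idea as a minimal Python sketch. The `OpenFileTrigger` class, its `subscribe`/`fire` methods and the `opened` list are made up for illustration; they are not from any real library.

```python
class OpenFileTrigger:
    """Hypothetical event source: keeps a list of callbacks to notify."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def fire(self, path_to_file):
        for handler in self._handlers:
            handler(path_to_file)


class FileOpener:
    def __init__(self, trigger):
        self.opened = []
        # The lambda closes over `self`, so no dedicated handler method is needed.
        trigger.subscribe(lambda path: self.open(path))

    def open(self, path_to_file):
        # Stand-in for real file handling: just record the request.
        self.opened.append(path_to_file)


trigger = OpenFileTrigger()
opener = FileOpener(trigger)
trigger.fire("notes.txt")   # opened via the event
opener.open("report.txt")   # opened directly
```

The subject only knows it holds a list of callables, which is about as loose as the coupling can get.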

Any language that has closures can use them for trampolining, which is a technique for refactoring recursion into iteration. This can get you out of "stack overflow" problems that naive implementations of many algorithms run into.
A trampoline is a function that "bounces" a closure back up to its caller. The closure captures "the rest of the work".
For example, in Python you can define a recursive accumulator to sum the values in an array:
testdata = range(0, 1000)
def accum(items):
    if len(items) == 0:
        return 0
    elif len(items) == 1:
        return items[0]
    else:
        return items[0] + accum(items[1:])
print("will blow up:", accum(testdata))
On my machine, this fails with a stack overflow (Python reports it as "maximum recursion depth exceeded") when the length of items exceeds 998.
The same function can be done in a trampoline style using closures:
def accum2(items):
    bounced = trampoline(items, 0)
    while (callable(bounced)):
        bounced = bounced()
    return bounced
def trampoline(items, initval):
    if len(items) == 0:
        return initval
    else:
        return lambda: trampoline(items[1:], initval+items[0])
By converting recursion to iteration, you don't blow out the stack. The closure has the property of capturing the state of the computation in itself rather than on the stack as you do with recursion.
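The bouncing loop can be pulled out into a reusable helper. This is only a sketch: `run_trampoline` and `accum_from` are names I made up, and it walks the list by index instead of slicing so that larger inputs stay cheap. It also assumes the final result is never itself callable, otherwise the loop would keep bouncing.

```python
def run_trampoline(bounced):
    """Keep calling thunks until a non-callable result comes back."""
    while callable(bounced):
        bounced = bounced()
    return bounced


def accum_from(items, index, total):
    # Each step returns either the final total or a closure holding
    # "the rest of the work" (the next index and the running total).
    if index == len(items):
        return total
    return lambda: accum_from(items, index + 1, total + items[index])


big = list(range(10000))  # far deeper than the default recursion limit
result = run_trampoline(accum_from(big, 0, 0))
```

Each bounce adds at most one frame on top of the loop, so the stack depth stays constant no matter how long the list is.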

Suppose you want to provide a class with the ability to create any number of FileOpener instances, but following IoC principles, you don't want the class creating FileOpeners to actually know how to do so (in other words, you don't want to new them). Instead, you want to use dependency injection. However, you only want this class to be able to generate FileOpener instances, and not just any instance. Here's what you can do:
class AppSetup
{
    private IContainer BuildDiContainer()
    {
        // assume this builds a dependency injection container and registers the types you want to create
    }
    public void setup()
    {
        IContainer container = BuildDiContainer();
        // create a function that uses the dependency injection container to create a `FileOpener` instance
        Func<FileOpener> getFileOpener = () => { return container.Resolve<FileOpener>(); };
        DependsOnFileOpener dofo = new DependsOnFileOpener(getFileOpener);
    }
}
Now you have your class that needs to be able to make FileOpener instances. You can use dependency injection to provide it with this capability, while retaining loose coupling:
class DependsOnFileOpener
{
    public DependsOnFileOpener(Func<FileOpener> getFileOpener)
    {
        // this class can create FileOpener instances any time it wants, without knowing where they come from
        FileOpener f = getFileOpener();
    }
}
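For comparison, the same factory-closure idea as a small Python sketch. The `Container` class here is a toy stand-in I wrote for illustration, not a real DI library, and `open_all` is a made-up method just to show the factory being used more than once.

```python
class FileOpener:
    def open(self, path_to_file):
        return "opened " + path_to_file


class Container:
    """Hypothetical stand-in for a DI container."""

    def __init__(self):
        self._registry = {}

    def register(self, cls, builder):
        self._registry[cls] = builder

    def resolve(self, cls):
        return self._registry[cls]()


class DependsOnFileOpener:
    def __init__(self, get_file_opener):
        # Only the factory is stored; the container is never visible here.
        self._get_file_opener = get_file_opener

    def open_all(self, paths):
        # A fresh FileOpener per path, without knowing where they come from.
        return [self._get_file_opener().open(p) for p in paths]


container = Container()
container.register(FileOpener, FileOpener)
# The closure captures `container` and exposes just one capability.
get_file_opener = lambda: container.resolve(FileOpener)
dofo = DependsOnFileOpener(get_file_opener)
results = dofo.open_all(["a.txt", "b.txt"])
```

The point of passing the closure rather than the container is that the dependent class cannot resolve anything other than a FileOpener.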


Function variable and an array of functions in Chapel

In the following code, I'm trying to create a "function pointer" and an array of functions by regarding function names as usual variables:
proc myfunc1() { return 100; }
proc myfunc2() { return 200; }
// a function variable?
var myfunc = myfunc1;
writeln( myfunc() );
myfunc = myfunc2;
writeln( myfunc() );
// an array of functions?
var myfuncs: [1..2] myfunc1.type;
writeln( myfuncs.type: string );
myfuncs[ 1 ] = myfunc1;
myfuncs[ 2 ] = myfunc2;
for fun in myfuncs do
writeln( fun() );
which seems to be working as expected (with Chapel v1.16):
100
200
[domain(1,int(64),false)] chpl__fcf_type_void_int64_t
100
200
So I'm wondering whether the above usage of function variables is legitimate? For creating an array of functions, is it usual to define a concrete function with desired signature first and then refer to its type (with .type) as in the above example?
Also, is it no problem to treat such variables as "usual" variables, e.g., pass them to other functions as arguments or include them as a field of class/record? (Please ignore these latter questions if they are too broad...) I would appreciate any advice if there are potential pitfalls (if any).
This code is using first class function support, which is prototype/draft in the Chapel language design. You can read more about the prototype support in the First-class Functions in Chapel technote.
While many uses of first-class functions work in 1.16 and later versions, you can expect that the language design in this area will be revisited. In particular there isn't currently a reasonable answer to the question of whether or not variables can be captured (and right now attempting to do so probably results in a confusing error). I don't know in which future release this will change, though.
Regarding the myfunc1.type part, the section in the technote I referred to called "Specifying the type of a first-class function" presents an alternative strategy. However I don't see any problem with using myfunc1.type in this case.
Lastly, note that the lambda support in the current compiler actually operates by creating a class with a this method. So you can do the same - create a "function object" (to borrow a C++ term) - that has the same effect. A "function object" could be a record or a class. If it's a class, you might use inheritance to be able to create an array of objects that can respond to the same method depending on their dynamic type. This strategy might allow you to work around current issues with first-class functions. Even if first-class-function support is completed, the "function object" approach allows you to be more explicit about captured variables. In particular, you might store them as fields in the class and set them in the class initializer. Here is an example creating and using an array of different types of function objects:
class BaseHandler {
// consider these as "pure virtual" functions
proc name():string { halt("base name called"); }
proc this(arg:int) { halt("base greet called"); }
}
class HelloHandler : BaseHandler {
proc name():string { return "hello"; }
proc this(arg:int) { writeln("Hello ", arg); }
}
class CiaoHandler : BaseHandler {
proc name():string { return "ciao"; }
proc this(arg:int) { writeln("Ciao ", arg); }
}
proc test() {
// create an array of handlers
var handlers:[1..0] BaseHandler;
handlers.push_back(new HelloHandler());
handlers.push_back(new CiaoHandler());
for h in handlers {
h(1); // calls 'this' method in instance
}
}
test();
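To illustrate the point about being explicit about captured variables, here is a rough analogue in Python rather than Chapel (so nothing here says anything about Chapel's own syntax). `Greeter` is a made-up class; Python's `__call__` plays the role of the `this` method.

```python
class Greeter:
    """Function object: the 'captured' variable is an explicit field."""

    def __init__(self, greeting):
        # Set in the initializer, as suggested above for captured state.
        self.greeting = greeting

    def __call__(self, arg):
        return f"{self.greeting} {arg}"


# An array of function objects that all respond to the same call.
handlers = [Greeter("Hello"), Greeter("Ciao")]
outputs = [h(1) for h in handlers]
```

Compared with a lambda, the state each handler carries is visible as a named field instead of being hidden in a closure.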
Yes, in your example you are using Chapel's initial support for first-class functions. To your second question, you could alternatively use a function type helper for the declaration of the function array:
var myfuncs: [1..2] func(int);
These first-class function objects can be passed as arguments into functions – this is how Futures.async() works – or stored as fields in a record (Try It Online! example). Chapel's first-class function capabilities also include lambda functions.
To be clear, the "initial" aspect of this support comes with the caveat (from the documentation):
This mechanism should be considered a stopgap technology until we have developed and implemented a more robust story, which is why it's being described in this README rather than the language specification.

Can you navigate the contents of a Vector's index via a String?

Is it possible to do something similar to this in Haxe?
private var _bindingsFiltered:Vector<String>;
_bindingsFiltered = new Vector<String>();
_controller_touched_binding.action = "meta_start";
What I would like to be able to do:
_bindingsFiltered[_controller_touched_binding.action] = "BUTTON_13";
trace(_bindingsFiltered["meta_start"]); //result: "BUTTON_13"
I want to be able to override a specific index too (still accessed via a string), with a new value, rather than keep pushing new content to the end of the vector. I have been using 'openfl.utils.Object' to cheat for now but I am looking for a more reliable approach for the long run.
Is there a way to do this in Haxe?
If not, what are my options?
I would also be interested in a solution for this in AS3, if there is one (avoiding the Array class).
My goal is to find a method that I can use in both languages seamlessly (with next to no differences).
Vectors cannot be indexed by string in Haxe. A vector is an array with a fixed size. See the Haxe manual on that subject.
Instead of vectors, you can use a Map.
class Test {
private var vector:Map<String, String> = new Map<String, String>();
public function new() {
var str = 'haxe';
vector[str] = "is great";
trace(vector[str]);
}
static function main() {
new Test();
}
}
https://try.haxe.org/#F74Ba
I think you could do this using flash.utils.Dictionary:
ActionScript
import flash.utils.Dictionary;
...
var _bindingsFiltered:Dictionary = new Dictionary ();
_bindingsFiltered[_controller_touched_binding.action] = "BUTTON_13";
trace(_bindingsFiltered["meta_start"]); //result: "BUTTON_13"
Haxe
import openfl.utils.Dictionary;
...
var _bindingsFiltered = new Dictionary<String, String> ();
_bindingsFiltered[_controller_touched_binding.action] = "BUTTON_13";
trace(_bindingsFiltered["meta_start"]); //result: "BUTTON_13"
First, do you really want an array / vector / list, or do you really want a hashmap of key / value pairs? How are you using the collection? Why do you want String keys? And related, is this mostly about access semantics (you want to type it this way), or are there runtime reasons you'd want to use strings (serialization / etc)?
From what you've described, it sounds like what you really want is an Object like the ones in AS3/JS/ECMAScript, with square-bracket access semantics -- obj[key]
Yes, you can do that in Haxe. The openfl.utils.Object class is a helper to do exactly this, using Dynamic objects and reflection. It should compile to exactly what you want on all Haxe targets.
In any case, if you'd like to feel like you're not bound to OpenFL, no problem. Copy the openfl/utils/Object.hx file and place it anywhere you like in your project's class path (and update the package statement).
There's nothing particularly OpenFL-ish about that code. It's pure Haxe code with no dependencies. It provides array access with String keys, as well as toString, toLocaleString, propertyIsEnumerable, iterator, isPrototypeOf, and hasOwnProperty functions (which ECMA-folk are used to.)
The transition from AS3/JS to Haxe is a little weird, especially when it comes to dynamic objects, and I've been meaning to blog more about it. ;) Good luck!
ETA: In truth, you probably want to get away from Dynamic/Reflection, and embrace a more type-strict approach. AS3/JS devs don't understand this at first, but it is where the benefits of Haxe come from. If you don't, then your Haxe experience is likely to be unpleasant.
Short answer: yes, you can.
abstract MyVector<T>(Vector<T>) {
    public function new(l:Int) this = new Vector<T>(l);
    @:op([]) public function set<K:T>(s:String, v:K) {
        switch (s) {
            case "FIRST": this[0] = v;
            case "SECOND": this[1] = v;
            default: return;
        }
    }
    @:op([]) public function get(s:String) {
        switch (s) {
            case "FIRST": return this[0];
            case "SECOND": return this[1];
            default: return cast 0;
        }
    }
}
var mv = new MyVector<String>(2);
mv["SECOND"] = "Second";
trace(mv["SECOND"]); // outputs Second
You can inline get and set methods if you want.

Improvements to a custom Scala recursion prevention mechanism

I would like to create a smart recursion prevention mechanism. I would like to be able to annotate a piece of code somehow, to mark that it should not be executed in recursion, and if it is indeed executed in recursion, then I want to throw a custom error (which can be caught to allow executing custom code when this happens)
Here is my attempt so far:
import scala.collection.mutable.{Set => MutableSet, HashSet => MutableHashSet }
case class RecursionException(uniqueID:Any) extends Exception("Double recursion on " + uniqueID)
object Locking {
var locks:MutableSet[Any] = new MutableHashSet[Any]
def acquireLock (uniqueID:Any) : Unit = {
if (! (locks add uniqueID))
throw new RecursionException(uniqueID)
}
def releaseLock (uniqueID:Any) : Unit = {
locks remove uniqueID
}
def lock1 (uniqueID:Any, f:() => Unit) : Unit = {
acquireLock (uniqueID)
try {
f()
} finally {
releaseLock (uniqueID)
}
}
def lock2[T] (uniqueID:Any, f:() => T) : T = {
acquireLock (uniqueID)
try {
return f()
} finally {
releaseLock (uniqueID)
}
}
}
and now to lock a code segment I do:
import Locking._
lock1 ("someID", () => {
// Custom code here
})
My questions are:
Is there any obvious way to get rid of the need for hard coding a unique identifier? I need a unique identifier which will actually be shared between all invocations of the function containing the locked section (so I can't have something like a counter for generating unique values, unless somehow scala has static function variables). I thought on somehow
Is there any way to prettify the syntax of the anonymous function? Specifically, something that will make my code look like lock1 ("id") { /* code goes here */ } or any other prettier look.
A bit silly to ask in this stage, but I'll ask anyway - Am I re-inventing the wheel? (i.e. does something like this exist?)
Wild final thought: I know that abusing the synchronized keyword (at least in Java) can guarantee that there is only one execution of the code at a time (in the sense that multiple threads cannot enter that part of the code at the same time). I don't think it prevents the same thread from executing the code twice (although I may be wrong here). Anyway, if it does prevent it, I still don't want it (even though my program is single threaded), since I'm pretty sure it would lead to a deadlock and wouldn't report an exception.
Edit: Just to make it clearer, this project is for debugging purposes and for learning Scala. It has no real usage other than easily finding code errors at runtime (detecting recursion where it shouldn't happen). See the comments to this post.
Not quite sure what you're aiming at, but a few remarks:
First, you do not need to do lock1 and lock2 to distinguish Unit and the other type. Unit is a proper value type, the generic method will work for it too. Also, you should probably use a call by name argument => T, rather than a function () => T, and use two argument lists:
def lock[T] (uniqueID:Any)(f: => T) : T = {
acquireLock (uniqueID)
try {
f
} finally {
releaseLock (uniqueID)
}
}
Then you can call with lock(id){block} and it looks like common instructions such as if or synchronized.
Second, why do you need a uniqueId, and why make Lock a singleton? Instead, make Lock a class, and have as many instances as you would have had ids.
class Lock {
def lock[T](f: => T): T = {acquireLock() ...}
}
(You may even name your lock method apply, so you can just do myLock{....} rather than myLock.lock{...})
Multithreading aside, you now just need a Boolean var for acquire/releaseLock
Finally, if you need to support multithreading, you have to decide whether several threads can enter the lock (that would not be recursion). If they can, the boolean should be replaced with a DynamicVariable[Boolean] (or maybe a Java ThreadLocal, as DynamicVariable is an InheritableThreadLocal, which you may or may not want). If they cannot, you just need to synchronize access in acquire/releaseLock.
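The "one lock instance per guarded piece of code, backed by a single Boolean" idea can be sketched in Python as a decorator, where the flag lives in a closure so no hand-written ID is needed. This is a minimal single-threaded sketch, not a translation of the Scala above; `no_recursion`, `ReentryError`, `countdown` and `double` are all names invented for the example.

```python
import functools


class ReentryError(Exception):
    """Raised when a guarded function is entered while already running."""


def no_recursion(func):
    # One flag per decorated function, held in the closure, so no
    # hand-written unique ID is needed. Single-threaded use is assumed.
    running = False

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal running
        if running:
            raise ReentryError(func.__name__)
        running = True
        try:
            return func(*args, **kwargs)
        finally:
            # Always release, even if the guarded code raised.
            running = False

    return wrapper


@no_recursion
def countdown(n):
    # Recurses for any n > 0, which the guard should reject.
    return 0 if n == 0 else countdown(n - 1)


@no_recursion
def double(x):
    return 2 * x
```

Calling `countdown(1)` raises `ReentryError`, while `double` and `countdown(0)` run normally because they never re-enter themselves.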
Is there any obvious way to get rid of the need for hard coding a unique identifier?
Since, from what you said in the comments, this is not production code, I guess you could use the function's hashCode property like this:
def lock1 (f:() => Unit) : Unit = {
    acquireLock (f.hashCode)
    try {
        f()
    } finally {
        releaseLock (f.hashCode)
    }
}
Is there any way to prettify the syntax of the anonymous function?
With the before-mentioned change the syntax should be prettier:
lock1 {
If you're planning on keeping the identifier (if hashcode doesn't cut it for you) you can define your method like this:
def lock1 (uniqueID:Any)(f:() => Unit) : Unit = {
That will let you call the lock1 method with:
lock1("foo") {
}
Cheers!

Actionscript 3.0 Best Option for Subclassing Vector Class (Flash Player 10)

I would like to take advantage of all the goodness of the newer Vector class for FP10, but it seems it is marked as final.
I am doing some intensive mathematical processing in ActionScript, and repeatedly process arrays of Numbers. I have previously been using my own subclass of Array (I call it NumericArray), with added functions such as sum(), mean(), add(), multiply(), etc. This works very well and allows for some clean OO code. However, I am finding through profiling that about 95% of my processing time occurs in the functions of these objects. I need more performance out of these arrays.
I want to use a Vector, as it provides some performance enhancements. I want to specifically use a Vector.<Number>. Unfortunately, I cannot subclass Vector as it is marked final.
What is the best and cleanest way to imitate what I was previously doing with a subclass of Array, to a Vector.<Number>?
I have thought about passing around Vector.<Number> variables instead of my custom class and just using utility functions to manipulate, but this is not good OO design and will be a pain to use, not to mention ugly.
If adding your additional functionality doesn't require access to protected properties/methods of Vector, you could create a wrapper class for the Vector. Something along these lines?
import flash.utils.Proxy;
import flash.utils.flash_proxy;
use namespace flash_proxy;
public class NumericVector extends Proxy
{
private var vector:Vector.<Number>;
public function NumericVector(vector:Vector.<Number> = null)
{
if(vector == null)
{
this.vector = new Vector.<Number>();
}
else
{
this.vector = vector;
}
}
override flash_proxy function nextName(index:int):String
{
return vector[index - 1].toString();
}
override flash_proxy function nextNameIndex(index:int):int
{
// implementation
}
public function sum():Number
{
// do whatever you intend to do
}
...
}
A way to sidestep this issue might be to use the as3ds (short for actionscript 3 data structures). Whether they can be faster than using Vector, I'm not sure.
How come on this page
http://help.adobe.com/en_US/AS3LCR/Flash_10.0/Vector.html
it says:
"Note: To override this method in a subclass of Vector, use ...args for the parameters, as this example shows:"
??
doesn't that imply that you can subclass a Vector?
James

What is Map/Reduce?

I hear a lot about map/reduce, especially in the context of Google's massively parallel compute system. What exactly is it?
From the abstract of Google's MapReduce research publication page:
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.
The advantage of MapReduce is that the processing can be performed in parallel on multiple processing nodes (multiple servers) so it is a system that can scale very well.
Since it's based on the functional programming model, the map and reduce steps have no side effects (the state and results of each subsection of a map process do not depend on another), so the data set being mapped and reduced can be split across multiple processing nodes.
Joel's Can Your Programming Language Do This? piece discusses how understanding functional programming was essential at Google to come up with MapReduce, which powers its search engine. It's a very good read if you're unfamiliar with functional programming and how it allows scalable code.
See also: Wikipedia: MapReduce
Related question: Please explain mapreduce simply
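To make the abstract quoted above concrete, here is a toy single-process word count in Python. It only mirrors the shape of the model (map, group by key, reduce) and says nothing about how Google's implementation works; all function names are my own.

```python
from collections import defaultdict


def map_phase(documents, map_fn):
    # Apply map_fn to every (key, value) input and collect all emitted pairs.
    return [pair for key, value in documents for pair in map_fn(key, value)]


def shuffle_phase(pairs):
    # Group intermediate values by their intermediate key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reduce_fn):
    # Merge all values that share a key into one result per key.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def count_words(_name, text):
    # User-supplied map function: emit (word, 1) for every word.
    return [(word, 1) for word in text.split()]


def total(_word, counts):
    # User-supplied reduce function: add up the ones.
    return sum(counts)


docs = [("a.txt", "to be or not to be"), ("b.txt", "be quick")]
counts = reduce_phase(shuffle_phase(map_phase(docs, count_words)), total)
```

Because `count_words` looks at one document at a time and `total` at one key at a time, each call could in principle run on a different node.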
Map is a function that applies another function to all the items on a list, to produce another list with all the return values on it. (Another way of saying "apply f to x" is "call f, passing it x". So sometimes it sounds nicer to say "apply" instead of "call".)
This is how map is probably written in C# (it's called Select and is in the standard library):
public static IEnumerable<R> Select<T, R>(this IEnumerable<T> list, Func<T, R> func)
{
foreach (T item in list)
yield return func(item);
}
As you're a Java dude, and Joel Spolsky likes to tell GROSSLY UNFAIR LIES about how crappy Java is (actually, he's not lying, it is crappy, but I'm trying to win you over), here's my very rough attempt at a Java version (I have no Java compiler, and I vaguely remember Java version 1.1!):
// represents a function that takes one arg and returns a result
public interface IFunctor
{
    Object invoke(Object arg);
}
public static Object[] map(Object[] list, IFunctor func)
{
    Object[] returnValues = new Object[list.length];
    for (int n = 0; n < list.length; n++)
        returnValues[n] = func.invoke(list[n]);
    return returnValues;
}
I'm sure this can be improved in a million ways. But it's the basic idea.
Reduce is a function that turns all the items on a list into a single value. To do this, it needs to be given another function func that turns two items into a single value. It would work by giving the first two items to func. Then the result of that along with the third item. Then the result of that with the fourth item, and so on until all the items have gone and we're left with one value.
In C# reduce is called Aggregate and is again in the standard library. I'll skip straight to a Java version:
// represents a function that takes two args and returns a result
public interface IBinaryFunctor
{
    Object invoke(Object arg1, Object arg2);
}
public static Object reduce(Object[] list, IBinaryFunctor func)
{
    if (list.length == 0)
        return null; // or throw something?
    if (list.length == 1)
        return list[0]; // just return the only item
    Object returnValue = func.invoke(list[0], list[1]);
    // the first two items are already combined, so continue from the third
    for (int n = 2; n < list.length; n++)
        returnValue = func.invoke(returnValue, list[n]);
    return returnValue;
}
These Java versions need generics adding to them, but I don't know how to do that in Java. But you should be able to pass them anonymous inner classes to provide the functors:
String[] names = getLotsOfNames();
String commaSeparatedNames = (String)reduce(names,
    new IBinaryFunctor() {
        public Object invoke(Object arg1, Object arg2)
            { return ((String)arg1) + ", " + ((String)arg2); }
    });
Hopefully generics would get rid of the casts. The typesafe equivalent in C# is:
string commaSeparatedNames = names.Aggregate((a, b) => a + ", " + b);
Why is this "cool"? Simple ways of breaking up larger calculations into smaller pieces, so they can be put back together in different ways, are always cool. The way Google applies this idea is to parallelization, because both map and reduce can be shared out over several computers.
But the key requirement is NOT that your language can treat functions as values. Any OO language can do that. The actual requirement for parallelization is that the little func functions you pass to map and reduce must not use or update any state. They must return a value that is dependent only on the argument(s) passed to them. Otherwise, the results will be completely screwed up when you try to run the whole thing in parallel.
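Here's a small Python sketch of why that requirement matters. With pure functions you can cut the list into chunks, process each chunk independently, and combine the partial results; for the regrouping to give the same answer, the combining function also has to be associative, as addition is. The chunks here just run one after another in a single process, and `chunked` is a helper I made up.

```python
from functools import reduce


def square(x):
    # Pure: the result depends only on the argument.
    return x * x


def add(a, b):
    # Pure and associative, so partial sums can be combined in any grouping.
    return a + b


def chunked(items, size):
    return [items[i:i + size] for i in range(0, len(items), size)]


numbers = list(range(1, 11))

# Sequential: map then reduce over the whole list.
sequential = reduce(add, map(square, numbers))

# "Distributed": each chunk could go to a different machine; here they
# simply run one after another in the same process.
partials = [reduce(add, map(square, chunk)) for chunk in chunked(numbers, 3)]
combined = reduce(add, partials)
```

If `square` or `add` read or wrote shared state, the chunked version could give a different answer from the sequential one.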
After getting most frustrated with either very long waffley or very short vague blog posts I eventually discovered this very good rigorous concise paper.
Then I went ahead and made it more concise by translating it into Scala, providing the simplest case, where a user just specifies the map and reduce parts of the application. In Hadoop/Spark, strictly speaking, a more complex programming model is employed that requires the user to explicitly specify 4 more functions, outlined here: http://en.wikipedia.org/wiki/MapReduce#Dataflow
import scalaz.syntax.id._
trait MapReduceModel {
type MultiSet[T] = Iterable[T]
// `map` must be a pure function
def mapPhase[K1, K2, V1, V2](map: ((K1, V1)) => MultiSet[(K2, V2)])
(data: MultiSet[(K1, V1)]): MultiSet[(K2, V2)] =
data.flatMap(map)
def shufflePhase[K2, V2](mappedData: MultiSet[(K2, V2)]): Map[K2, MultiSet[V2]] =
mappedData.groupBy(_._1).mapValues(_.map(_._2))
// `reduce` must be a monoid
def reducePhase[K2, V2, V3](reduce: ((K2, MultiSet[V2])) => MultiSet[(K2, V3)])
(shuffledData: Map[K2, MultiSet[V2]]): MultiSet[V3] =
shuffledData.flatMap(reduce).map(_._2)
def mapReduce[K1, K2, V1, V2, V3](data: MultiSet[(K1, V1)])
(map: ((K1, V1)) => MultiSet[(K2, V2)])
(reduce: ((K2, MultiSet[V2])) => MultiSet[(K2, V3)]): MultiSet[V3] =
mapPhase(map)(data) |> shufflePhase |> reducePhase(reduce)
}
// Kinda how MapReduce works in Hadoop and Spark except `.par` would ensure 1 element gets a process/thread on a cluster
// Furthermore, the splitting here won't enforce any kind of balance and is quite unnecessary anyway as one would expect
// it to already be split on HDFS - i.e. the filename would constitute K1
// The shuffle phase will also be parallelized, and use the same partition as the map phase.
abstract class ParMapReduce(mapParNum: Int, reduceParNum: Int) extends MapReduceModel {
def split[T](splitNum: Int)(data: MultiSet[T]): Set[MultiSet[T]]
override def mapPhase[K1, K2, V1, V2](map: ((K1, V1)) => MultiSet[(K2, V2)])
(data: MultiSet[(K1, V1)]): MultiSet[(K2, V2)] = {
val groupedByKey = data.groupBy(_._1).map(_._2)
groupedByKey.flatMap(split(mapParNum / groupedByKey.size + 1))
.par.flatMap(_.map(map)).flatten.toList
}
override def reducePhase[K2, V2, V3](reduce: ((K2, MultiSet[V2])) => MultiSet[(K2, V3)])
(shuffledData: Map[K2, MultiSet[V2]]): MultiSet[V3] =
shuffledData.map(g => split(reduceParNum / shuffledData.size + 1)(g._2).map((g._1, _)))
.par.flatMap(_.map(reduce))
.flatten.map(_._2).toList
}
Map is a native JS method that can be called on an array. It creates a new array from the results of calling some function on every element of the original array. So if you mapped function(element) { return element * 2; }, you would get back a new array with every element doubled. The original array is left unmodified.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map
Reduce is a native JS method that can also be called on an array. It takes a function and an initial output value called an accumulator. It loops through each element in the array, applies the function to the accumulator and the element, and so reduces the array to a single value (the final accumulator). It is useful because you can have any output you want; you just have to start with that type of accumulator. So if I wanted to reduce something into an object, I would start with an accumulator {}.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/Reduce?v=a
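The same two ideas in Python, as a rough sketch using the built-in `map` and `functools.reduce` in place of the JS array methods: doubling every element, and reducing into a dict by starting from an empty-dict accumulator. `tally` is a made-up helper.

```python
from functools import reduce


def tally(acc, word):
    # Return a new dict rather than mutating the accumulator in place.
    return {**acc, word: acc.get(word, 0) + 1}


original = [1, 2, 3]
doubled = list(map(lambda n: n * 2, original))  # original is left unmodified

words = ["a", "b", "a", "c", "a"]
counts = reduce(tally, words, {})  # {} is the initial accumulator
```

The type of the result follows the type of the accumulator you start with: a number gives a number, a dict gives a dict.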