algorithms: Implementing custom hash table based dict - actionscript-3

I am learning to program abstract data types and am trying to build a custom hash-table-based dict.
So far I've created a placeholder class.
public class HashMapDict implements IDict
{
private var _map:Array;
public function HashMapDict()
{
_map = new Array();
//TODO: implement function
}
public function set(keys:Array):Boolean
{
// 1. For each key in array of keys
// 2. Pass Key.key to hash function
// 3. Write Key to _map[hash(Key.key)]
return true;
}
}
I see the main method set doing the following
// 1. For each key in array of keys
// 2. Pass Key.key to hash function
// 3. Write Key to _map[hash(Key.key)]
I am thinking about using a cryptography lib for hash generation, but I am a bit confused about how it should work. I looked at several libs, such as as3crypto (http://crypto.hurlant.com/demo/), and they seem to produce hashes in a form that I don't think can be used as array indexes.
E.g.
http://screencast.com/t/bE1lYQEqp4D
Can you advise which lib I can use to generate usable hashes, and what they should look like?

Just as a heads up -- I can almost guarantee that you will not be able to make something better than Dictionary or even Object at this. Your proposed plan could work, but it would offer no benefit over these. I also feel compelled to suggest Vector over Array as Vectors are faster and more powerful.
The problem with hash libs is that they generally produce very, very large numbers. MD5, for example, produces a hex string representing a 128-bit value, which is far more than fits into a uint (a uint in AS holds 32 bits, so 2^32 distinct values; MD5 has 2^128). That 2^32 limit also happens to be the maximum size of an Array/Vector in AS.
This isn't to say that they can't fit into Number (which can hold about 1.79*10^308), but it does mean that you'll lose the benefit of numeric indexing and you certainly won't get much benefit from Vectors at that point. You'll basically be falling back on Object.
To be honest, it really does look like you have one of two options. Either you implement a direct lookup using a second Array/Vector, which has the problem of O(n) lookup time where a hash table would give O(1), or you fall back on Dictionary or Object.
It seems, at least to me, that you'll need to use Dictionary or Object no matter what to get this done.

For implementation of a hash table, a cryptographic hash function is overkill.
Use this only if you are concerned with an attack of someone who tries to feed you bad data (e.g. keys with lots of hash collisions) to make the hash table slow.
For a hash table use, a hash function like the following one is enough (pseudocode, as I don't know the right syntax):
hash = 0
for c in string:
hash = hash * 13 + c;
return hash
But as other answers said, there is already a hash table built in, and you don't really need to reimplement it.
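The pseudocode above translates almost directly into real code. The sketch below is JavaScript rather than ActionScript 3, and it makes two assumptions that are not in the answer: the running value is wrapped to an unsigned 32-bit integer, and the result is reduced modulo a hypothetical `bucketCount` so it can serve as an array index.

```javascript
// Multiplicative string hash following the pseudocode (multiplier 13).
// `bucketCount` is a hypothetical parameter: the length of the backing array.
function hashString(str, bucketCount) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    // >>> 0 wraps the running value to an unsigned 32-bit integer
    hash = (hash * 13 + str.charCodeAt(i)) >>> 0;
  }
  return hash % bucketCount;
}

const buckets = new Array(16).fill(null);
buckets[hashString("ab", buckets.length)] = "value for ab";
```

Different strings can land in the same bucket, so a real table built this way still needs some form of collision handling.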

I might be missing something, but I think you should look at flash.utils::Dictionary.
It makes hashing obsolete. If you must have some sort of primitive key, I suggest using the following:
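import flash.utils.Dictionary;
class UIDUtil {
// Weak keys: the map does not keep the keyed objects alive.
static private var map:Dictionary = new Dictionary(true);
static private var counter:int = 0;
static public function getUID(value:*):int {
// Pre-increment so the first UID is 1; a UID of 0 is falsy and ||= would reassign it.
return map[value] ||= ++counter;
}
}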
But your class I would implement as:
public class HashMapDict implements IDict {
private var _map:Dictionary = new Dictionary();
public function set(keys:Array):Boolean {
for each (var key:* in keys) _map[key] = key;
return true;
}
}
I am not sure of its purpose though ;)

Related

Can you navigate the contents of a Vector's index via a String?

Is it possible to do something similar to this in Haxe?
private var _bindingsFiltered:Vector<String>;
_bindingsFiltered = new Vector<String>();
_controller_touched_binding.action = "meta_start";
What I would like to be able to do:
_bindingsFiltered[_controller_touched_binding.action] = "BUTTON_13";
trace(_bindingsFiltered["meta_start"]); //result: "BUTTON_13"
I want to be able to override a specific index too (still accessed via a string), with a new value, rather than keep pushing new content to the end of the vector. I have been using 'openfl.utils.Object' to cheat for now but I am looking for a more reliable approach for the long run.
Is there a way to do this in Haxe?
If not, what are my options?
I would also be interested in a solution for this in AS3, if there is one (avoiding the Array class).
My goal is to find a method that I can use in both languages seamlessly (next-to-none, differences).
Vectors cannot be indexed by string in Haxe. A vector is an array with a fixed size. This is the Haxe manual on that subject.
Instead of vectors, you can use a Map.
class Test {
private var vector:Map<String, String> = new Map<String, String>();
public function new() {
var str = 'haxe';
vector[str] = "is great";
trace(vector[str]);
}
static function main() {
new Test();
}
}
https://try.haxe.org/#F74Ba
I think you could do this using flash.utils.Dictionary:
ActionScript
import flash.utils.Dictionary;
...
var _bindingsFiltered:Dictionary = new Dictionary ();
_bindingsFiltered[_controller_touched_binding.action] = "BUTTON_13";
trace(_bindingsFiltered["meta_start"]); //result: "BUTTON_13"
Haxe
import openfl.utils.Dictionary;
...
var _bindingsFiltered = new Dictionary<String, String> ();
_bindingsFiltered[_controller_touched_binding.action] = "BUTTON_13";
trace(_bindingsFiltered["meta_start"]); //result: "BUTTON_13"
First, do you really want an array / vector / list, or do you really want a hashmap of key / value pairs? How are you using the collection? Why do you want String keys? And related, is this mostly about access semantics (you want to type it this way), or are there runtime reasons you'd want to use strings (serialization / etc)?
From what you've described, it sounds like what you really want is an Object like the ones in AS3/JS/ECMAScript, with square-bracket access semantics -- obj[key]
Yes, you can do that in Haxe. The openfl.utils.Object class is a helper to do exactly this, using Dynamic objects and reflection. It should compile to exactly what you want on all Haxe targets.
In any case, if you'd like to feel like you're not bound to OpenFL, no problem. Copy the openfl/utils/Object.hx file and place it anywhere you like in your project's class path (and update the package statement).
There's nothing particularly OpenFL-ish about that code. It's pure Haxe code with no dependencies. It provides array access with String keys, as well as toString, toLocaleString, propertyIsEnumerable, iterator, isPrototypeOf, and hasOwnProperty functions (which ECMA-folk are used to.)
The transition from AS3/JS to Haxe is a little weird, especially when it comes to dynamic objects, and I've been meaning to blog more about it. ;) Good luck!
ETA: In truth, you probably want to get away from Dynamic/Reflection, and embrace a more type-strict approach. AS3/JS devs don't understand this at first, but it is where the benefits of Haxe come from. If you don't, then your Haxe experience is likely to be unpleasant.
Short answer: yes, you can.
abstract MyVector<T>(Vector<T>) {
public function new(l:Int) this = new Vector<T>(l);
@:op([]) public function set<K:T>(s:String, v:K) {
switch (s) {
case "FIRST": this[0] = v;
case "SECOND": this[1] = v;
default: return;
}
}
@:op([]) public function get(s:String) {
switch (s) {
case "FIRST": return this[0];
case "SECOND": return this[1];
default: return cast 0;
}
}
}
var mv = new MyVector<String>(2);
mv["SECOND"] = "Second";
trace(mv["SECOND"]); // outputs Second
You can inline get and set methods if you want.

What is the difference between Set,Map,WeakSet,WeakMap in ES6? [duplicate]

There are already some questions about maps and weak maps, like this one: What's the difference between ES6 Map and WeakMap? But I would like to ask in which situations I should favor each of these data structures, and what I should take into consideration when I favor one over the others.
Examples of the data structures from:https://github.com/lukehoban/es6features
// Sets
var s = new Set();
s.add("hello").add("goodbye").add("hello");
s.size === 2;
s.has("hello") === true;
// Maps
var m = new Map();
m.set("hello", 42);
m.set(s, 34);
m.get(s) == 34;
// Weak Maps
var wm = new WeakMap();
wm.set(s, { extra: 42 });
wm.size === undefined
// Weak Sets
var ws = new WeakSet();
ws.add({ data: 42 });
// Because the added object has no other references, it will not be held in the set
Bonus. Which of the above data structures will produce the same/similar result of doing: let hash = Object.create(null); hash[index] = something;
This is covered in §23.3 of the specification:
If an object that is being used as the key of a WeakMap key/value pair is only reachable by following a chain of references that start within that WeakMap, then that key/value pair is inaccessible and is automatically removed from the WeakMap.
So the entries in a weak map, if their keys aren't referenced by anything else, will be reclaimed by garbage collection at some point.
In contrast, a Map holds a strong reference to its keys, preventing them from being garbage-collected if the map is the only thing referencing them.
MDN puts it like this:
The key in a WeakMap is held weakly. What this means is that, if there are no other strong references to the key, then the entire entry will be removed from the WeakMap by the garbage collector.
And WeakSet does the same.
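Garbage collection itself cannot be observed deterministically from script, but the API differences that follow from weak keys can. The sketch below is illustrative only; the variable names are made up for the example.

```javascript
const strong = new Map();
const weak = new WeakMap();

const key = { id: 1 };
strong.set(key, "kept alive by the Map");
weak.set(key, "does not keep the key alive");

// A Map can be counted and iterated; a WeakMap exposes neither,
// because its contents may change whenever the collector runs.
const mapSize = strong.size;                                        // 1
const weakSizeType = typeof weak.size;                              // "undefined"
const weakIsIterable = typeof weak[Symbol.iterator] === "function"; // false

// WeakMap keys must be objects; a string key is rejected with a TypeError.
let rejectedString = false;
try {
  weak.set("hello", 42);
} catch (e) {
  rejectedString = e instanceof TypeError;
}
```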
...in which situation should I favor the use of these data structures?
Any situation where you don't want the fact you have a map/set using a key to prevent that key from being garbage-collected. Here are some examples:
Having instance-specific information which is truly private to the instance, which looks like this: (Note: This example is from 2015, well before private fields were an option. Here in 2021, I'd use private fields for this.)
let Thing = (() => {
var privateData = new WeakMap();
class Thing {
constructor() {
privateData.set(this, {
foo: "some value"
});
}
doSomething() {
console.log(privateData.get(this).foo);
}
}
return Thing;
})();
There's no way for code outside that scoping function to access the data in privateData. That data is keyed by the instance itself. You wouldn't do that without a WeakMap because it would be a memory leak, your Thing instances would never be cleaned up. But WeakMap only holds weak references, and so if your code using a Thing instance is done with it and releases its reference to the instance, the WeakMap doesn't prevent the instance from being garbage-collected; instead, the entry keyed by the instance is removed from the map.
Holding information for objects you don't control. Suppose you get an object from some API and you need to remember some additional information about that object. You could add properties to the object itself (if it's not sealed), but adding properties to objects outside of your control is just asking for trouble. Instead, you can use a WeakMap keyed by the object to store your extra information.
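A minimal sketch of that side-table idea follows. The names `extraInfo`, `remember`, `recall`, and `apiObject` are hypothetical, and the frozen object merely stands in for something an API might hand back.

```javascript
// Side table keyed by objects we do not own.
const extraInfo = new WeakMap();

function remember(obj, info) {
  extraInfo.set(obj, info);
}

function recall(obj) {
  return extraInfo.get(obj); // undefined if nothing was stored for obj
}

// Stand-in for an object received from some API; frozen, so we could not
// add properties to it even if we wanted to.
const apiObject = Object.freeze({ name: "from the API" });
remember(apiObject, { seenAt: 1 });
```

The object itself is left untouched, and the stored entry does not prevent it from being collected once the rest of the program drops it.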
One use case for WeakSet is tracking or branding: Suppose that before "using" an object, you need to know whether that object has ever been "used" in the past, but without storing that as a flag on the object (perhaps because if it's a flag on the object, other code can see it [though you could use a private field to prevent that]; or because it's not your object [so private fields wouldn't help]). For instance, this might be some kind of single-use access token. A WeakSet is a simple way to do that without forcing the object to stay in memory.
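The single-use token case could look roughly like this. `usedTokens` and `redeem` are names invented for the sketch, not part of any API.

```javascript
const usedTokens = new WeakSet();

// Returns true the first time a token is presented, false on every later attempt.
function redeem(token) {
  if (usedTokens.has(token)) {
    return false;
  }
  usedTokens.add(token);
  return true;
}

const token = { scope: "download" };
const first = redeem(token);   // true
const second = redeem(token);  // false
```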
Which of the above data structures will produce the same/similar result of doing: let hash = Object.create(null); hash[index] = something;
That would be nearest to Map, because the string index (the property name) will be held by a strong reference in the object (it and its associated property will not be reclaimed if nothing else references it).
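To make the comparison concrete, here is a small sketch that puts a prototype-less object next to a Map. It only illustrates the similarity described above, along with one difference worth knowing: property names are coerced to strings, whereas Map keys keep their type.

```javascript
const hash = Object.create(null); // no prototype, so no inherited keys
hash["toString"] = "just a key";

const map = new Map();
map.set("toString", "just a key");

const plain = {};
const inheritedOnPlain = "toString" in plain;  // true: inherited from Object.prototype
const inheritedOnHash = "constructor" in hash; // false: nothing is inherited

// Property names are coerced to strings; Map keys keep their type.
hash[1] = "one";
map.set(1, "one");
const hashHasStringOne = "1" in hash;  // true
const mapHasStringOne = map.has("1");  // false
```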

In AS3, where do you draw the line between Dictionary and ArrayCollection?

I have been using a Dictionary object in my program that took ints as its keys and stored RTMFP peer IDs in the appropriate locations. Each int was unique and represented one user.
Now I'm needing to expand on this where users are identified by a combination of the int and a Boolean value, kind of like this:
private var m_iUID:int;
private var m_blnIsCurrent:Boolean;
Only the combination between those two really uniquely identifies the user. That being said I was just about to use a new class made out of this for the Dictionary keys; but then it occurred to me that instead of doing it this way, I could just add the peer ID to the class definition and turn the Dictionary object into an ArrayCollection:
private var m_iUID:int;
private var m_blnIsCurrent:Boolean;
public var m_strNearID:String;
So now I'm wondering which is really better in this scenario. And that question has led to a bigger question: where do you really draw the line between these two collection types in general? They're suddenly starting to not seem all that different after all, except where you're trying to avoid messing with class definitions. I guess I'm really asking for advice about both the specific scenario and the general question. Thanks!
ArrayCollection is just a wrapper for an Array, and is only available in Flex.
In AS3 you really have 3 fundamental hash table types: Array, Object, and Dictionary. You choose which one to use based on the type of key you want to use: an integer, a string, or an object reference. Arrays will convert any key to an int, Object will convert any key to a string. Dictionary works like Object for string keys (and will convert primitives to a string) but what it is really good at is using object references as keys.
If you want to use a single int as the unique key, use an array. If you want to use a single string as the unique key, use an object. If you want to use object references as the unique key, use a Dictionary.
In your case you should probably use an Object, and a custom toString() method on your "key" class. This is because you want to use a composite of primitive values (NOT an object reference) as your unique key. There is no way to do this natively, so you'll have to mash the values together as a single string. Objects are the best (fastest) hash table for string keys, so that is the collection you should use.
Example:
class User {
private var m_iUID:int;
private var m_blnIsCurrent:Boolean;
public var m_strNearID:String;
public function User(UID:int, IsCurrent:Boolean) {
m_iUID = UID;
m_blnIsCurrent = IsCurrent;
}
// Custom toString to mash together primitives
public function toString():String {
return m_iUID.toString() + "-" + (m_blnIsCurrent ? "1" : "0");
}
}
// Later:
var allUsers:Object = {};
var user1:User = new User(231049, true);
var user2:User = new User(0x2309, false);
// Implicitly calls toString():
allUsers[user1] = "User 1";
allUsers[user2] = "User 2";
// All of the following will successfully retrieve the value for user1 ("User 1"):
// ONLY the first would work if allUsers was a Dictionary
trace(allUsers[user1]);
trace(allUsers[user1.toString()]);
trace(allUsers["231049-1"]);
trace(allUsers[new User(231049, true)]);
Dictionary and ArrayCollection have some important differences:
Dictionary maps objects to other objects, while ArrayCollection is just a list of objects.
ArrayCollection is Flex only, so unusable in a generic AS3 project.
Which one you should use really depends on what you need in your app:
Will you be using the "identity" object (with user id and "is current") somewhere else, without an associated peer id? In that case, make it a separate Identity class or so and use a Dictionary to map Identity instances to peer ids.
Do you need to perform lookups based on identities? In other words, do you need to ask "which peer id is associated with this identity?". If so, go for Dictionary + Identity once more, to avoid looping through a list instead.
I'm sure there are more considerations, but these should get you started.

Creating a "true" HashMap implementation with Object Equality in ActionScript 3

I've been spending some of my spare time working on a set of collections for ActionScript 3, but I've hit a pretty serious roadblock thanks to the way ActionScript 3 handles equality checks inside Dictionary objects.
When you compare a key in a dictionary, ActionScript uses the === operator to perform the comparison. This has a nasty side effect: only references to the same instance resolve to true, not distinct objects that are equal in value. Here's what I mean:
const jonny1 : Person = new Person("jonny", 26);
const jonny2 : Person = new Person("jonny", 26);
const table : Dictionary = new Dictionary();
table[jonny1] = "That's me";
trace(table[jonny1]) // traces: "That's me"
trace(table[jonny2]) // traces: undefined.
The way I am attempting to combat this is to provide an Equalizer interface which looks like this:
public interface Equalizer
{
function equals(object : Object) : Boolean;
}
This allows me to perform an instanceof-style check whenever I need to perform an equality operation inside my collections (falling back on the === operator when the object doesn't implement Equalizer); however, this doesn't get around the fact that my underlying data structure (the Dictionary object) has no knowledge of this.
The way I am currently working around the issue is by iterating through all the keys in the dictionary and performing the equality check whenever I perform a containsKey() or get() operation - however, this pretty much defeats the entire point of a hashmap (cheap lookup operations).
If I am unable to continue using a Dictionary instance as the backing for map, how would I go about creating the hashes for unique object instances passed in as keys so I can still maintain equality?
How about you compute a hash code for your objects when you insert them, and then look them up by the hash code in your backing dictionary? The hashcode should compare === just fine. Of course, that would require you to have a Hashable interface for your object types instead of your Equalizer interface, so it isn't much less work than you are already doing, but you do get the cheap lookups.
How about rather doing this:
public interface Hashable {
function hash():String;
}
personally, I ask myself, why you want to do this ... hashing objects to obtain keys makes little sense if they are mutable ...
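The concern about mutable keys can be shown in a few lines. This is a JavaScript sketch, not AS3; `hashOf` is a hypothetical helper that derives the key from the object's current field values.

```javascript
// `hashOf` is a hypothetical helper: the key depends on the current field values.
function hashOf(person) {
  return person.name + "|" + person.age;
}

const table = new Map();
const jonny = { name: "jonny", age: 26 };
table.set(hashOf(jonny), "That's me");

const before = table.get(hashOf(jonny)); // "That's me"

jonny.age = 27; // mutate the object after it was used as a key
const after = table.get(hashOf(jonny));  // undefined: the entry is still stored under "jonny|26"
```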
also, you might consider using a different approach, as for example this factory:
package {
public class Person {
/**
* don't use this!
* @private
*/
public function Person(name:String, age:int) {
if (!instantiationAllowed)
throw new Error("use Person.getPerson instead of constructor");
//...
}
private static var instantiationAllowed:Boolean = false;
private static var map:Object = {};
private static function create(name:String, age:int):Person {
instantiationAllowed = true;
var ret:Person = new Person(name, age);
instantiationAllowed = false;
return ret;
}
public static function getPerson(name:String, age:int):Person {
var ageMap:Array = map[name];
if (ageMap == null) {
map[name] = ageMap = [];
return ageMap[age] = Person.create(name, age);
}
if (ageMap.hasOwnProperty(age))
return ageMap[age];
return ageMap[age] = Person.create(name, age);
}
}
}
it ensures, there's only one person with a given name and age (if that makes any sense) ...
Old thread I know, but still worth posting.
const jonny1 : Person = new Person("jonny", 26); const jonny2 : Person = new Person("jonny", 26);
is creating two completely different objects that will not compare using ==, guess I don't see why it's any more of a road block because of as3
The problem with AS3/JavaScript/EcmaScript is not that they create two different, equivalent objects.
The problem is that they cannot equate those two equivalent objects--only identity works, since there are no equals or hashCode methods that can be overridden with class-specific comparison logic.
For Map implementations such as dynamic Object or Dictionary, this means that you have to either use Strings or references as keys: you cannot recover objects from a map using different but equivalent objects.
To work around that problem, people either resort to strict toString implementations (for Object maps), which is undesirable, or to instance control for Dictionaries, as in @back2dos's example, which introduces different problems. (Also, note that @back2dos's solution does not really guarantee unique Person instances, since there is a time window during which asynchronous threads will be allowed to instantiate new Persons.)
@A.Levy's solution is good, except that in general hashCodes are not strictly required to issue unique values (they are meant to map entries to buckets, allowing for fast lookups, with fine-grained differentiation done through the equals method).
You need both a hashCode and an equals method, e.g.
public interface IEquable
{
function equals(object : Object) : Boolean;
function hash():String;
}
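How the two methods cooperate can be sketched as follows. The example is JavaScript rather than AS3; the `HashMap` and `Person` classes are illustrative, and the hash is deliberately coarse so that two unequal keys share a bucket and `equals` has to tell them apart.

```javascript
// Minimal hash map whose keys provide hash() and equals(), as in the interface above.
class HashMap {
  constructor() {
    this.buckets = new Map(); // hash string -> array of [key, value] pairs
  }
  set(key, value) {
    const h = key.hash();
    let bucket = this.buckets.get(h);
    if (!bucket) {
      bucket = [];
      this.buckets.set(h, bucket);
    }
    for (const entry of bucket) {
      if (entry[0].equals(key)) {
        entry[1] = value; // an equal key is already present: overwrite
        return;
      }
    }
    bucket.push([key, value]);
  }
  get(key) {
    const bucket = this.buckets.get(key.hash()) || [];
    for (const [k, v] of bucket) {
      if (k.equals(key)) return v;
    }
    return undefined;
  }
}

class Person {
  constructor(name, age) { this.name = name; this.age = age; }
  // Deliberately coarse hash: people with the same name share a bucket.
  hash() { return this.name; }
  equals(other) {
    return other instanceof Person && other.name === this.name && other.age === this.age;
  }
}

const people = new HashMap();
people.set(new Person("jonny", 26), "That's me");
people.set(new Person("jonny", 40), "Someone else");
```

A lookup with a fresh `new Person("jonny", 26)` finds the first entry even though it is a different instance: the hash narrows the search to one bucket and `equals` picks the matching key within it.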
In any programming language,
const jonny1 : Person = new Person("jonny", 26);
const jonny2 : Person = new Person("jonny", 26);
is creating two completely different objects that will not compare using ==, guess I don't see why it's any more of a road block because of as3

What's the cleanest way to simulate pass-by-reference in Actionscript 3.0?

Actionscript 3.0 (and I assume Javascript and ECMAScript in general) lacks pass-by-reference for native types like ints. As a result I'm finding getting values back from a function really clunky. What's the normal pattern to work around this?
For example, is there a clean way to implement swap( intA, intB ) in Actionscript?
I believe the best you can do is pass a container object as an argument to a function and change the values of some properties in that object:
function swapAB(aValuesContainer:Object):void
{
if (!(aValuesContainer.hasOwnProperty("a") && aValuesContainer.hasOwnProperty("b")))
throw new ArgumentError("aValuesContainer must have properties a and b");
var tempValue:int = aValuesContainer["a"];
aValuesContainer["a"] = aValuesContainer["b"];
aValuesContainer["b"] = tempValue;
}
var ints:Object = {a:13, b:25};
swapAB(ints);
I suppose an alternative would be somewhere defining this sort of thing ...
public class Reference {
public var value:*;
}
Then use functions that take some number of Reference arguments to act as "pointers" if you're really just looking for "out" parameters and either initialize them on the way in or not and your swap would become:
function swap(a:Reference, b:Reference):void {
var tmp:* = a.value;
a.value = b.value;
b.value = tmp;
}
And you could always go nuts and define specific IntReference, StringReference, etc.
This is nitpicking, but int, String, Number and the others are passed by reference, it's just that they are immutable. Of course, the effect is the same as if they were passed by value.
You could also use a wrapper instead of int:
public class Integer
{
public var value:int;
public function Integer(value:int)
{
this.value = value;
}
}
Of course, this would be more useful if you could use operator overloading...
Just look at some Java code. Java has had the convention that reference types are passed by reference and primitive types are passed by value since its inception. It's a very good model in many ways.
But talking about swap, the best and easiest way to do a swap in Java/AS3 is with the following three lines:
var temp:int = array[i];
array[i] = array[j];
array[j] = temp;
There's not really any reason to use a function to do a simple swap, when you can do it faster with just 3 lines.
It is annoying. But if you use different idioms than in e.g. C#, you can get reasonable-quality results. If you need to pass a lot of parameters back and forth, pass in an object filled with the needed data, and change the object's parameters when you return. The Object class is for just this sort of thing.
If you just need to return a bunch of data, return an Object. This is more in keeping with the ECMAScript style than pass-by-ref semantics.
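As a rough illustration of returning an object instead of using "out" parameters, here is a JavaScript sketch. The `divide` function is made up for the example, and the destructuring on the last line is ES2015 syntax, which should not be assumed to exist in ActionScript 3.

```javascript
// Return several results in one object instead of writing through "out" parameters.
function divide(dividend, divisor) {
  return {
    quotient: Math.floor(dividend / divisor),
    remainder: dividend % divisor
  };
}

const result = divide(17, 5);
// ES2015 destructuring unpacks the fields into local variables.
const { quotient, remainder } = result;
```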
Destructuring assignment (e.g. [a,b] = [b,a]) isn't defined in the ECMA-262 3 specification, and it's not implemented in JavaScript 1.5, which is the version equivalent to the JScript implementation in IE. I've seen this syntax in the AS4 specifications preview though, and I believe it's part of JavaScript 1.7.
If ActionScript works like Javascript,
[a,b] = [b,a]
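For reference, in JavaScript engines that implement ES2015 or later, where destructuring assignment was standardised, the swap can be written as below. Whether a given ActionScript compiler accepts this syntax is a separate question that this sketch does not answer.

```javascript
let a = 13;
let b = 25;
[a, b] = [b, a]; // builds a temporary array, then unpacks it in swapped order
```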