Represent a letter trigram table in Octave?

I have a letter trigram table that associates a probability with each trigram of letters, such as 'thr'. I am not sure what the best way would be to represent such a table in Octave so that look-ups are efficient.
Any suggestions?

You could use an associative-array-like data structure, aptly named struct().
% Create a new struct
trigrams = struct();
% Add an item explicitly, by using the format 'struct.key',
% where 'key' can be an arbitrary key
trigrams.length = 0;
trigrams.thr = 0.02;
trigrams.length += 1;
% Use setfield() when you don't know the key beforehand (e.g. if you're reading
% the values from a file, etc.)
trigramkey = 'hi';
trigrams = setfield(trigrams, trigramkey, 0.0007);
trigrams.length += 1;
% Likewise, use getfield() when you need a value dynamically
workingprob = getfield(trigrams, trigramkey);
% You can also check the existence of a key
hi_exists = isfield(trigrams, 'hi');
% By the way, you don't actually have to track the length like I've been doing;
% counting the fields works too (minus 1 here, because 'length' is itself a field)
trigramlength = length(fieldnames(trigrams)) - 1;
Note that the setfield() function is not in-place; it returns a new struct.

Generate auto increment id from JSON schema faker

I'm looking for way to generate data by JSON schema faker js with IDs incremented from 0.
When I'm trying to use autoIncrement parameter in schema, I get valid values, but this auto increment is started from random number.
Is that possible to do that with this package?
I didn't find an official solution to the problem, but here is a workaround.
json-schema-faker's source code for generating auto-incremented integers (node_modules\json-schema-faker\lib\index.js) explains why it starts from a random integer:
// safe auto-increment values
container.define('autoIncrement', function (value, schema) {
if (!this.offset) {
var min = schema.minimum || 1;
var max = min + env.MAX_NUMBER;
this.offset = random$1.number(min, max);
}
if (value === true) {
return this.offset++;
}
return schema;
});
It is the if (!this.offset) branch that sets up the initial value. To achieve our goal, we can modify the code inside the branch like this:
if (!this.offset) {
var min = schema.minimum || 1;
// var max = min + env.MAX_NUMBER;
// this.offset = random$1.number(min, max);
this.offset = min;
}
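When minimum is specified in the schema, its value will be used as the starting point. Otherwise, 1 is used instead. Note that a minimum of 0 counts as unspecified here, because schema.minimum || 1 treats 0 as falsy; and since !this.offset is also true for 0, starting from 0 needs explicit undefined checks in both places.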
It is also noteworthy that, if you specify minimum with an extremely large number, the auto-incrementation will no longer be "safe".
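The original question asked for IDs starting at 0, and both `schema.minimum || 1` and `!this.offset` in the snippets above treat 0 as "not set". A minimal sketch of the same counter logic that honours 0 is shown here in Python. The class and method names are hypothetical and are not part of json-schema-faker; the point is only the use of an explicit "unset" sentinel instead of a truthiness check.

```python
class AutoIncrement:
    """Hypothetical stand-in for the patched generator, not json-schema-faker's API."""

    def __init__(self):
        # None means "not initialised yet"; 0 remains a valid offset.
        self.offset = None

    def next(self, minimum=None):
        # Initialise once, using an explicit None check rather than truthiness,
        # so that a minimum of 0 is respected.
        if self.offset is None:
            self.offset = 1 if minimum is None else minimum
        value = self.offset
        self.offset += 1
        return value


counter = AutoIncrement()
ids = [counter.next(minimum=0) for _ in range(3)]
```

With a truthiness check in place of `is None`, the first call would fall back to 1 and the sequence would never contain 0.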
For anyone searching for a more current answer: you can now set an 'initialOffset' value within the schema, which acts as the start value.

Randomly selecting an object property

I guess a step back is in order. My original question is at the bottom of this post for reference.
I am writing a word guessing game and wanted a way to:
1. Given a word length of 2 - 10 characters, randomly generate a valid English word to guess.
2. Given a 2 - 10 character guess, ensure that it is a valid English word.
I created a vector of 9 objects, one for each word length and dynamically created 172000
property/ value pairs using the words from a word list to name the properties and setting their value to true. The inner loop is:
for (i = 0; i < _WordCount[wordLength] - 2; i)
{
_WordsList[wordLength]["" + _WordsVector[wordLength][i++]] = true;
}
To validate a word, the following lookup returns true if valid:
function Validate(key:String):Boolean
{
return _WordsList[key.length - 2][key]
}
I transferred them from a vector to objects to take advantage of the hash table lookup of the properties. I haven't looked at how much memory this all takes, but it's been a useful learning exercise.
I just wasn't sure how best to randomly choose a property from one of the objects. I was thinking of validating whatever method I chose by generating 1,000,000 words and analyzing the statistics of the distribution.
So I suppose my question should really first be: am I better off using some other approach, such as keeping the lists in vectors and doing a search each time?
Original question
Newbie first question:
I read a thread that said that traversal order in a for..in is determined by a hash table and appears random.
I'm looking for a good way to randomly select a property in an object. Would the first element in a for..in traversal of the properties, or perhaps a random nth element in the iteration, be truly random? I'd like to ensure that there is approximately an equal probability of accessing a given property. The objects have between approximately 100 and 20000 properties. Other approaches?
thanks.
Looking at the scenario you described in your edited question, I'd suggest using a Vector.<String> and your map object.
You can store all your keys in the vector and map them in the object, then you can select a random numeric key in the vector and use the result as a key in the map object.
To make it clear, take a look at this simple example:
var keys:Vector.<String> = new Vector.<String>();
var map:Object = { };
function add(key:String, value:*):void
{
keys.push(key);
map[key] = value;
}
function getRandom():*
{
var randomKey = keys[int(Math.random() * keys.length)];
return map[randomKey];
}
And you can use it like this:
add("a", "x");
add("b", "y");
add("c", "z");
var randomValue:* = getRandom();
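For comparison, here is a rough Python sketch of the same idea: a list of keys for uniform random selection, alongside a dictionary for fast lookup. The class name is made up for illustration. Unlike the add() above, this sketch skips keys that are already present, on the assumption that a duplicate entry in the key list would skew the selection probabilities.

```python
import random


class RandomAccessMap:
    """Dictionary for O(1) lookup plus a key list for uniform random picks."""

    def __init__(self):
        self._keys = []   # every key exactly once, in insertion order
        self._map = {}    # key -> value

    def add(self, key, value):
        # Only record the key the first time, so each key stays equally likely.
        if key not in self._map:
            self._keys.append(key)
        self._map[key] = value

    def contains(self, key):
        return key in self._map

    def get_random(self, rng=random):
        # random.choice picks an index uniformly from the key list.
        return self._map[rng.choice(self._keys)]


words = RandomAccessMap()
words.add("a", "x")
words.add("b", "y")
words.add("c", "z")
random_value = words.get_random()
```

Validation of a guess then becomes a call to `contains`, and random selection does not depend on the traversal order of the underlying hash table.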
Using Object instead of String
Instead of storing the strings you can store objects that have the string inside of them,
something like:
public class Word
{
public var value:String;
public var length:int;
public function Word(value:String)
{
this.value = value;
this.length = value.length;
}
}
Use this object as value instead of the string, but you need to change your map object to be a Dictionary:
var map:Dictionary = new Dictionary();
function add(key:Word, value:*):void
{
keys.push(key);
map[key] = value;
}
This way you won't duplicate every word (but will have a little class overhead).

Matlab fminsearch options/restrictions

I have this function in Matlab which is supposed to find the smallest value possible for minValuePossible, by varying the two initial set values of inValues. How can I set the fmin search function to NOT try negative values while trying to find the minimum? Also how can I set the number of different variations the fminsearch function performs while trying to find the minimum? Because currently it tries somewhere around 20 different combinations of the two inValues and then completes. Maybe define the amount by which it changes each value? How would I do that?
function Valueminimiser
inValues = [50,50];
minValuePossible = fminsearch(@minimiser, inValues);
function result = minimiser(inValues)
x=inValues(1);
y=inValues(2);
RunMode = 2;
ValueOne = x;
ValueTwo = y;
[maxSCRAout] = main(RunMode,ValueOne,ValueTwo);
result = minValuePossible;
end
end
How can I set the fmin search function to NOT try negative values while trying to find the minimum?
Check the constraints at the beginning of your minimiser function. If the input violates them, return a huge function value. This keeps fminsearch away from values you are not interested in:
function result = minimiser(inValues)
if any(inValues < 0) % check if there is any negative number in the input
result = hugeValue; % placeholder: a large constant you define, e.g. 1e10
return; % return to fminsearch - do not execute the rest of the code
end
x=inValues(1);
y=inValues(2);
RunMode = 2;
ValueOne = x;
ValueTwo = y;
[maxSCRAout] = main(RunMode,ValueOne,ValueTwo);
result = minValuePossible;
Also how can I set the number of different variations the fminsearch function performs while trying to find the minimum?
You can define options for fminsearch by using the optimset function. The 'MaxFunEvals' parameter of optimset is the maximum number of function evaluations. Notice that this count includes evaluations at the values you constrained, so setting 'TolX' as advised by @slayton might be better if you are concerned about accuracy.
options = optimset('MaxFunEvals',numberOfVariations);
minValuePossible = fminsearch(@minimiser, inValues,options);
The docs for fminsearch don't describe a way to restrict the domain of the function you want to minimize.
If you want to restrict the search to non-negative numbers, you can simply wrap your function in a call to abs, so that the objective only ever sees non-negative inputs. Note that fminsearch itself may still return negative values; take abs of the result as well.
minValuePossible = fminsearch(@(x) minimiser(abs(x)), inValues);
If you are worried about it constantly converging to the same minimum, then try a variety of different initial values.
Lastly, you can alter the termination tolerances for X and for the function value using the TolX and TolFun options. These are set through an options structure built with optimset and passed as the third argument:
fminsearch(@(x) minimiser(abs(x)), inValues, optimset('TolX', x_tolerance));

How to sort var length ids (composite string + numeric)?

I have a MySQL database whose keys are of this type:
A_10
A_10A
A_10B
A_101
QAb801
QAc5
QAc25
QAd2993
I would like them to sort first by the alpha portion, then by the numeric portion, just like above. I would like this to be the default sorting of this column.
1) how can I sort as specified above, i.e. write a MySQL function?
2) how can I set this column to use the sorting routine by default?
some constraints that might be helpful: the numeric portion of my ID's never exceeds 100,000. I use this fact in some javascript code to convert my ID's to strings concatenating the non-numeric portion with the (number + 1,000,000). (At the time I had not noticed the variations/subparts as above such as A_10A, A_10B, so I'll have to revamp that part of my code.)
The best way to achieve what you want is to store each part in its own column, and I would strongly recommend to change table structure. If it's impossible, you can try the following:
Create 3 UDFs which return the prefix, numeric part, and postfix of your string. For better performance they should be native (MySQL, like any other RDBMS, is not really good at complex string parsing). Then you can call these functions in an ORDER BY clause or in a trigger body which validates your column. In any case, it will be slower than if you create 3 columns.
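To make the three-part split concrete, here is a small Python sketch of the ordering it produces, applied to the IDs from the question. This is only an illustration of the sort key, not MySQL code; the function name and the choice of -1 for an ID with no numeric part are assumptions.

```python
import re

# prefix = leading non-digits, number = first run of digits, postfix = the rest
_ID_PARTS = re.compile(r"^(\D*)(\d*)(.*)$")


def id_sort_key(identifier):
    """Split an ID into (prefix, numeric part, postfix) for sorting."""
    prefix, number, postfix = _ID_PARTS.match(identifier).groups()
    # Assumption: IDs without digits sort before numbered ones with the same prefix.
    return (prefix, int(number) if number else -1, postfix)


ids = ["QAd2993", "QAc25", "A_101", "QAc5", "A_10B", "QAb801", "A_10A", "A_10"]
ordered = sorted(ids, key=id_sort_key)
```

Sorting by this tuple compares the prefix as text, then the number numerically, then the postfix as text, which is the same comparison three separate columns would give in an ORDER BY.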
No simple answer that I know of. I had something similar a while back but had to use jQuery to sort it. What I did was first get the output into a JavaScript array. Then you may want to zero-pad your numbers: separate the alpha parts from the numerics using a regex, then reassemble the array:
var zarr = new Array();
for(var i=0; i<val.length; i++){
// split the ID into alternating numeric and non-numeric chunks
var chunks = val[i].match(/(\d+|[^\d]+)/g);
var padded = '';
for(var s=0; s<chunks.length; s++){
if(isNaN(chunks[s]))
padded += chunks[s];
else
padded += zeroPad(chunks[s], 5);
}
// reassemble one padded key per ID
zarr.push(padded);
}
function zeroPad(num,count){
var numZeropad = num + '';
while(numZeropad.length < count) {
numZeropad = "0" + numZeropad;
}
return numZeropad;
}
You'll end up with an array like this:
A_00100
QAb00801
QAc00005
QAc00025
QAd02993
Then you can do a natural sort. I know you may want to do it through straight MySQL, but I am not too sure if it does natural sorting.
Good luck!

function to return index of largest neighbor

F# function
Problem:
given a list of items e.g.:
["5";"10";"2";"53";"4"]
and a Search Index, I require a function such that it compares the current given index against its neighbor, returning the largest index
Example:
Given Index 1 will return Index value 2 (because 10 is greater than 5).
Given Index 4 will return Index 4 (because 53 is greater than 4)
Currently this is my function. It does not compile:
let GetMaxNode (x:Array) Idx = if x.[Idx] > x.[Idx+1] then Idx else If x.[Idx] < x.[Idx+1] then Idx+1
The errors I'm getting for all the x' are:
The field, constructor or member 'Item' is not defined (FS0039)
And also the second If:
The value or constructor 'If' is not defined (FS0039)
I suspect I'm still thinking in a procedural way, I was thinking about using pattern matching, however I was not confident enough with the syntax to try it.
Please can you also explain the answer as well, as I'm trying to learn F#, just the solution will not help me much.
Here's some code based on yours:
let GetMaxNode (x:_[]) idx =
    if x.[idx] > x.[idx+1] then
        idx
    elif x.[idx] < x.[idx+1] then
        idx+1
    else
        idx // same, return this one
The main changes are:
To declare an array type, say <typename> []. In this case, we don't care about the element type, so I use _ as a "don't care, please go infer the right thing for me" type variable.
"else if" is spelled elif in F# (and keywords are lowercase, so If is not recognized).
You need an else case for when the two elements are equal.
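The same comparison can be sketched in Python to show the control flow without F# syntax. This is an illustrative translation with 0-based indices, not part of the answer above. The bounds check for the last index is an added assumption, since the F# version would raise an exception there.

```python
def get_max_node(items, idx):
    """Return idx or idx + 1, whichever position holds the larger element."""
    # Assumption: the last element has no right-hand neighbour, so it wins by default.
    if idx + 1 >= len(items):
        return idx
    if items[idx] < items[idx + 1]:
        return idx + 1
    # Larger or equal: keep the current index, as in the else branch above.
    return idx


data = [5, 10, 2, 53, 4]
results = [get_max_node(data, i) for i in range(len(data))]
```

The list holds integers rather than strings on purpose: comparing "10" with "5" as strings orders them character by character, which would make "5" the larger of the two.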
It is difficult to write a solution to your problem in a functional style, because your problem is defined in terms of indices. When using functional data structures, such as lists, you don't usually refer to the elements by their index.
A functional version of your question would be, for example, to create a list that contains true when the element at the current position is larger than the next one and false when it is smaller. For your data this would give:
let data = [ 5; 10; 2; 53; 4 ]
let res = [ false; true; false; true; ] // no item to compare '4' with
This can be solved quite nicely using a recursive function that walks through the list and pattern matching (because pattern matching works much better with functional lists than with arrays)
let rec getMaxNodes data =
    match data with
    // list has at least two elements and current is larger
    | current::next::other when current >= next ->
        // process the rest of the list
        let rest = (getMaxNodes (next::other))
        // return 'true' followed by recursively processed rest of the list
        true::rest
    // list has at least two elements and current is smaller
    | current::next::rest ->
        // same as the previous case, but we return false
        false::(getMaxNodes (next::rest))
    | _ ->
        // one element (so we cannot compare it with the next one)
        // or empty list, so we return empty list
        []
getMaxNodes data
Here's the pattern matching version of Brian's answer.
let GetMaxNode (x:_[]) idx =
    match idx with
    | idx when x.[idx] > x.[idx+1] -> idx
    | idx when x.[idx] < x.[idx+1] -> idx + 1
    | idx -> idx // same, return this one
You may also see a syntax shortcut as you look at more F# code. The below code is functionally exactly the same as the above code.
let GetMaxNode (x:_[]) = function
    | idx when x.[idx] > x.[idx+1] -> idx
    | idx when x.[idx] < x.[idx+1] -> idx + 1
    | idx -> idx // same, return this one
Whenever you start talking about indices, you are best sticking with Arrays or ResizeArrays; F# lists are not well-suited for operations on indices since they are singly-linked head to tail. That being said, it is not too difficult to write this algorithm in a purely functional way by moving through the list using a recursive loop and keeping track of the current index and current element.
let find elements index =
    //a local tail-recursive function hides implementation details
    //(cur::tail) is a pattern match on the list, i is the current index position
    let rec loop (cur::tail) i =
        if i = index then //when the current index matches the search index
            if cur >= tail.Head then i //compare cur to tail.Head (which is next)
            else (i+1)
        else loop tail (i+1) //else continue
    loop elements 0 //the entry point of loop and our return value
Use a list of ints instead of strings to get the results you expect (since "10" is actually less than "5"):
> let x = [5;10;2;53;4];;
> find x 0;;
val it : int = 1
> find x 1;;
val it : int = 1
> find x 2;;
val it : int = 3
> find x 3;;
val it : int = 3