I am currently trying to implement basic speech recognition in AS3. It needs to be completely client-side, so I can't use powerful server-side speech recognition tools. My idea is to detect the syllables in a word and use them to determine which word was spoken. I am aware that this will greatly limit what can be recognized, but I only need to recognize a few keywords, and I can make sure they all have a different number of syllables.
I am currently able to generate a 1D array of voice levels for a spoken word, and when I plot it I can clearly see distinct peaks for the syllables in most cases. However, I am completely stuck on how to find those peaks. I only really need the count, but I suppose that comes with finding them. At first I thought of grabbing a few maximum values and comparing them with the average of all values, but I had forgotten that one peak can be bigger than the others, so all my "peaks" ended up on a single actual peak.
I stumbled onto some MATLAB code that looks almost too short to be true, but I can't verify that because I have been unable to convert it to any language I know. I tried AS3 and C#. Could you start me on the right path, or do you have any pseudocode for peak detection?
The MATLAB code is pretty straightforward. I'll try to translate it into something more pseudocode-like.
It should be easy to translate to ActionScript or C#. Try it, and post follow-up questions with your code if you get stuck; that way you'll learn the most.
Param: delta (a tolerance that depends on your data; try out different values)
min = Inf (or some very high value)
max = -Inf (or some very low value)
lookformax = 1
for every datapoint d [0..maxdata] in array arr do
    this = arr[d]
    if this > max
        max = this
        maxpos = d
    endif
    if this < min
        min = this
        minpos = d
    endif
    if lookformax == 1
        if this < max-delta
            there's a maximum at position maxpos
            min = this
            minpos = d
            lookformax = 0
        endif
    else
        if this > min+delta
            there's a minimum at position minpos
            max = this
            maxpos = d
            lookformax = 1
        endif
    endif
endfor
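For anyone who wants to try the pseudocode above before porting it, here is a minimal Python sketch of the same idea. The function name `detect_peaks` and the sample `levels` data are made up for illustration, and the function collects indices instead of reporting them one by one. It is only a sketch, not a tuned implementation.

```python
import math

def detect_peaks(values, delta):
    """Return (maxima, minima) as lists of indices into values."""
    maxima, minima = [], []
    lo, hi = math.inf, -math.inf      # running minimum / maximum
    lo_pos = hi_pos = 0
    looking_for_max = True
    for i, v in enumerate(values):
        if v > hi:
            hi, hi_pos = v, i
        if v < lo:
            lo, lo_pos = v, i
        if looking_for_max:
            # The signal has dropped by more than delta: confirm the maximum.
            if v < hi - delta:
                maxima.append(hi_pos)
                lo, lo_pos = v, i
                looking_for_max = False
        elif v > lo + delta:
            # The signal has risen by more than delta: confirm the minimum.
            minima.append(lo_pos)
            hi, hi_pos = v, i
            looking_for_max = True
    return maxima, minima

# Made-up voice levels with three obvious bumps.
levels = [0, 2, 9, 3, 1, 4, 8, 2, 0, 1, 7, 1]
peaks, valleys = detect_peaks(levels, delta=3)
print(len(peaks), peaks, valleys)
```

Note that a peak is only reported once the signal has fallen by more than `delta` after it, so a peak at the very end of the array can go unreported.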
Finding the peaks and valleys of a curve is all about looking at its slope: at a peak or valley the slope is 0. As I am guessing a voice curve is very irregular, it must first be smoothed until only the significant peaks remain.
So, as I see it, the curve should be treated as a set of points. Groups of neighbouring points should be averaged to produce a simpler, smoother curve. Then the differences between consecutive points should be compared; stretches where consecutive points differ very little are where the slope is close to 0, and those areas can be identified as peaks, valleys or plateaus.
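As a rough illustration of the smoothing idea, here is a small Python sketch. It assumes a centered moving average as the smoothing step and counts a peak whenever the smoothed curve switches from rising to falling. The names `moving_average` and `count_peaks`, the window size and the sample data are all illustrative choices, not part of the original suggestion.

```python
def moving_average(values, window):
    """Centered moving average; the window simply shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def count_peaks(values, window=3):
    """Count the places where the smoothed curve turns from rising to falling."""
    smooth = moving_average(values, window)
    count = 0
    rising = False
    for prev, cur in zip(smooth, smooth[1:]):
        if cur > prev:
            rising = True
        elif cur < prev and rising:
            count += 1          # slope changed sign: a peak (or the end of a plateau)
            rising = False
    return count

# Two bumps, each with a small dip that would otherwise look like an extra peak.
noisy = [0, 4, 3, 5, 0, 0, 4, 3, 5, 0]
print(count_peaks(noisy, window=1))  # no smoothing
print(count_peaks(noisy, window=3))  # light smoothing
```

With no smoothing the small dips are counted as separate peaks; with a window of 3 they merge into one peak per bump. The right window size depends on the data, so it would need some experimentation.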
If anyone wants the final code in AS3, here it is:
function detectPeaks(values:Array, tolerance:int):void
{
    // Start min at the highest and max at the lowest possible value,
    // so that the first data point replaces both.
    var min:int = int.MAX_VALUE;
    var max:int = int.MIN_VALUE;
    var lookformax:int = 1;
    var maxpos:int = 0;
    var minpos:int = 0;
    for(var i:int = 0; i < values.length; i++)
    {
        var v:int = values[i];
        if (v > max)
        {
            max = v;
            maxpos = i;
        }
        if (v < min)
        {
            min = v;
            minpos = i;
        }
        if (lookformax == 1)
        {
            if (v < max - tolerance)
            {
                // Maximum found at maxpos: mark it with a green circle.
                canvas.graphics.beginFill(0x00FF00);
                canvas.graphics.drawCircle(maxpos % stage.stageWidth, (1 - (values[maxpos] / 100)) * stage.stageHeight, 5);
                canvas.graphics.endFill();
                min = v;
                minpos = i;
                lookformax = 0;
            }
        }
        else
        {
            if (v > min + tolerance)
            {
                // Minimum found at minpos: mark it with a red circle.
                canvas.graphics.beginFill(0xFF0000);
                canvas.graphics.drawCircle(minpos % stage.stageWidth, (1 - (values[minpos] / 100)) * stage.stageHeight, 5);
                canvas.graphics.endFill();
                max = v;
                maxpos = i;
                lookformax = 1;
            }
        }
    }
}
I'm trying to run the HPS (Harmonic Product Spectrum) algorithm and the results are not right (48000 Hz, 16 bits).
I split the buffer containing the recorded signal into several segments, apply a Hanning window to each, and finally take the FFT.
I get a peak in each FFT that corresponds to the frequency I am using, or to an octave of it. But when I do the HPS, the fundamental frequency comes out as 0, because the values in the array where I store the product are too small, smaller than my peak in the original FFT.
This is the code of the HPS:
int i_max_h = 0;
double m_max_h = miniBuffer[0];
//m_max is the value of the peak in the original time domain array
m_max_h = m_max;
//array for the sum
double sum [] = new double[miniBuffer.length];
int fund_freq = 0;
//It could be divide by 3, but I'm not going over 500Hz, so it should works
for(int k = 0; k < 24000/48 ; k++)
{
    //HPS down sampling and multiply
    sum[k] = miniBuffer[k] * miniBuffer[2*k] * miniBuffer[3*k];
    // find fundamental frequency (maximum value in plot)
    if( sum[k] > m_max_h && k > 0 )
    {
        m_max_h = sum[k];
        i_max_h = k;
    }
}
//This should get the fundamental freq. from sum
fund_freq = (i_max_h * Fs / 24000);
System.out.print("Fundamental Freq.: ");
System.out.println(fund_freq);
System.out.println("");
The original HPS code is HERE
I don't know why the product has such small values, when both it and its peak should be bigger than the original ones. I've applied a real forward FFT; maybe the problem is the -1 to 1 range, which makes the product shrink when I multiply.
Any idea how to fix this so that the HPS works?
How could I undo the normalization?
The problem was that I was expecting larger amplitude values in the sum array (the HPS array), but my values are normalized because I apply the FFT algorithm to them.
This is the solution I came up with: multiply each individual value by 10 before multiplying them together.
The number 10 is a coefficient that I chose myself; it could be wrong in some high-frequency cases, where a larger coefficient might be needed.
for(int k = 0; k < 24000/48 ; k++)
{
    sum[k] = ((miniBuffer[k]*10) * (miniBuffer[2*k]*10) * (miniBuffer[3*k]*10));
    // find fundamental frequency (maximum value in plot)
    if( sum[k] > m_max_h && k > 0 )
    {
        m_max_h = sum[k];
        i_max_h = k;
    }
}
The range of frequencies is 24000/48 = 500, so it covers 0 to 499 Hz, which is more than I need for a bass.
If the split of the full array is shorter than 24000, I should decrease the number 48. This is admissible because the downsampled arrays have lengths 24000/3 and 24000/2, so this value could go as low as 3 and it should still work.
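A side note on the scaling factor, with a hedged Python sketch. Multiplying every bin by the same positive constant does not change which bin has the largest product, so the small values may not be the real issue. In the snippets above the running maximum `m_max_h` starts at `m_max`, the peak of the original data, which products of values below 1 can never exceed. Starting the running maximum below any possible product avoids the need for a coefficient. The names `hps_peak_bin` and `bin_to_hz` and the synthetic `spectrum` are assumptions made up for this example.

```python
import math

def hps_peak_bin(magnitudes, harmonics=3):
    """Return the bin index whose harmonic product is largest."""
    limit = len(magnitudes) // harmonics   # keeps harmonics * k inside the array
    best_bin, best_value = 0, -math.inf    # start below any possible product
    for k in range(1, limit):              # skip bin 0 (DC)
        product = 1.0
        for h in range(1, harmonics + 1):
            product *= magnitudes[h * k]
        if product > best_value:
            best_bin, best_value = k, product
    return best_bin

def bin_to_hz(bin_index, sample_rate, fft_size):
    """Convert an FFT bin index to a frequency in Hz."""
    return bin_index * sample_rate / fft_size

# Synthetic magnitude spectrum: low noise floor plus harmonics of bin 50.
spectrum = [0.001] * 1024
for harmonic_bin, level in ((50, 0.5), (100, 0.4), (150, 0.3), (200, 0.2)):
    spectrum[harmonic_bin] = level

print(hps_peak_bin(spectrum))                        # normalized values
print(hps_peak_bin([m * 10 for m in spectrum]))      # same bin after scaling
```

Both calls pick the same bin, which suggests the coefficient only matters because of the starting value of the running maximum.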
I am using Google Sheets to log my water usage and want to find out how much I have used in a given period by using linear interpolation. I would really like a function similar to FORECAST that, instead of using the entire range to interpolate, uses just the nearest points above and below.
I am keen to try to code it myself (I have done lots of VBA) but don't really know where to start with Google Apps Script. Does anyone have a starting point for me?
The process I would take is:
Interpolate(x, data_y, data_x)
// Check value is within range of known values (could expand function to use closest two values and extrapolate...)
(is X within XMin and XMax)
// Find closest X value below (X1), corresponding Y1
// Find closest X value above (X2), corresponding Y2
Return Y = Y1+(X-X1)*((Y2-Y1)/(X2-X1))
function interpolation(x_range, y_range, x_value) {
  // x_range and y_range arrive from the sheet as 2D arrays with one value per row;
  // Math.max/Math.min coerce each single-cell row to a number.
  if (x_value > Math.max.apply(Math, x_range) || x_value < Math.min.apply(Math, x_range)) {
    throw "value can't be interpolated !!";
  }
  // Find the pair of neighbouring rows that brackets x_value (ascending or descending data).
  for (var i = 0, iLen = x_range.length; i < iLen - 1; i++) {
    if ((x_range[i][0] <= x_value && x_range[i+1][0] > x_value) || (x_range[i][0] >= x_value && x_range[i+1][0] < x_value)) {
      var yValue = y_range[i][0];
      var xDiff = x_range[i+1][0] - x_range[i][0];
      var yDiff = y_range[i+1][0] - yValue;
      var xInt = x_value - x_range[i][0];
      return (xInt * (yDiff / xDiff)) + yValue;
    }
  }
  // Reached when no bracketing pair was found, e.g. when x_value equals the last x entry.
  return y_range[x_range.length-1][0];
}
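To check the arithmetic outside the spreadsheet, here is a small Python sketch of the same two-point interpolation. It assumes the x values are sorted in strictly ascending order and uses a binary search to find the bracketing pair. The function name `interpolate` and the meter readings are invented for the example.

```python
from bisect import bisect_right

def interpolate(xs, ys, x):
    """Linear interpolation between the two known points that bracket x.

    Assumes xs is sorted in strictly ascending order and matches ys in length.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two points")
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the range of known values")
    # Index of the first known x strictly greater than x, clamped so that
    # x == xs[-1] still uses the last segment.
    upper = min(bisect_right(xs, x), len(xs) - 1)
    lower = upper - 1
    x1, x2 = xs[lower], xs[upper]
    y1, y2 = ys[lower], ys[upper]
    return y1 + (x - x1) * (y2 - y1) / (x2 - x1)

# Hypothetical meter readings taken on day 0, 7 and 14.
days = [0, 7, 14]
readings = [100.0, 170.0, 205.0]
usage = interpolate(days, readings, 10) - interpolate(days, readings, 0)
print(usage)
```

The usage over a period is then just the difference between two interpolated readings.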
I want to traverse all the elements in the set Q = [0, 2^16) in a non-sequential manner. To do so I need a function f(x) Q --> Q which gives the order in which the set will be visited. For example:
f(0) = 2345
f(1) = 4364
f(2) = 24
(...)
To recover the order I would need the inverse function f'(x) Q --> Q which would output:
f'(2345) = 0
f'(4364) = 1
f'(24) = 2
(...)
The function must be bijective: each element of Q maps uniquely to another element of Q.
How can I generate such a function, or are there any known functions that do this?
EDIT: In the following answer, f(x) is "what comes after x", not "what goes in position x". For example, if your first number is 5, then f(5) is the next element, not f(1). In retrospect, you probably thought of f(x) as "what goes in position x". The function defined in this answer is much weaker if used as "what goes in position x".
Linear congruential generators fit your needs.
A linear congruential generator is defined by the equation
f(x) = a*x+c (mod m)
for some constants a, c, and m. In this case, m = 65536.
An LCG has full period (the property you want) if the following properties hold:
c and m are relatively prime.
a-1 is divisible by all prime factors of m.
If m is a multiple of 4, a-1 is a multiple of 4.
We'll go with a = 5, c = 1.
To invert an LCG, we solve for x in terms of f(x):
x = (a^-1)*(f(x) - c) (mod m)
We can find the inverse of 5 mod 65536 by the extended Euclidean algorithm, or since we just need this one computation, we can plug it into Wolfram Alpha. The result is 52429.
Thus, we have
f(x) = (5*x + 1) % 65536
f^-1(x) = (52429 * (x - 1)) % 65536
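A quick way to convince yourself that these constants work is to walk the whole cycle. The Python sketch below does that; the names `f`, `f_inv` and `order` are just illustrative. One caveat: Python's `%` always returns a non-negative result, so `(x - 1)` being -1 for `x = 0` is harmless here, whereas in languages such as C, Java or ActionScript you would need to add the modulus back when the result is negative.

```python
M = 1 << 16        # 65536
A, C = 5, 1        # multiplier and increment from the answer
A_INV = 52429      # modular inverse of 5 mod 65536

def f(x):
    """Next element after x."""
    return (A * x + C) % M

def f_inv(y):
    """Element that comes before y."""
    return (A_INV * (y - C)) % M

# Sanity check on the inverse multiplier.
assert (A * A_INV) % M == 1

# Walk the cycle starting from 0 and record the visiting order.
order = []
x = 0
for _ in range(M):
    order.append(x)
    x = f(x)

print(order[:5])
print(len(set(order)) == M)   # every element visited exactly once
```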
There are many approaches to solving this.
Since your set size is small, the function and its inverse can simply be stored in memory. Once you choose your permutation, you can keep the forward and reverse directions in lookup tables.
One approach to creating a permutation is to put all elements in an array and then randomly swap them "enough" times. C code:
// PERM_SIZE and SEED are constants you define yourself;
// swap() is assumed to exchange the two ints it is given.
int f[PERM_SIZE], inv_f[PERM_SIZE];
int i;
// start out with identity permutation
for (i=0; i < PERM_SIZE; ++i) {
    f[i] = i;
    inv_f[i] = i;
}
// seed your random number generator
srand(SEED);
// loop "enough" times, where we choose "enough" = size of array
for (i=0; i < PERM_SIZE; ++i) {
    int j = rand()%PERM_SIZE;
    swap( &f[i], &f[j] );
}
// create inverse of f
for (i=0; i < PERM_SIZE; ++i)
    inv_f[f[i]] = i;
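The same lookup-table idea can be sketched in a few lines of Python. This version swaps in the standard library's `random.Random.shuffle` (a Fisher-Yates shuffle) instead of the hand-written swap loop, and seeds a private generator so the permutation is reproducible. `make_permutation` and the seed value are names chosen for this example only.

```python
import random

def make_permutation(size, seed):
    """Build a reproducible random permutation and its inverse as lookup tables."""
    rng = random.Random(seed)          # private generator, so the seed fixes the result
    forward = list(range(size))
    rng.shuffle(forward)               # Fisher-Yates shuffle from the standard library
    inverse = [0] * size
    for position, value in enumerate(forward):
        inverse[value] = position
    return forward, inverse

forward, inverse = make_permutation(1 << 16, seed=12345)
print(forward[:3])
print(inverse[forward[2]])   # recovers position 2
```

For 2^16 entries the two tables are small, so the memory cost is negligible compared with the convenience of O(1) lookups in both directions.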
Enjoy
I am trying to use the fmincon function in MATLAB. I am getting a warning that the algorithm is being changed to evaluate my function (the warning is shown at the end of the post). I wanted to use fminsearch, but I have 2 constraints I need to satisfy. It doesn't make sense to me that MATLAB changes algorithms, because my constraints are very simple. Here are the constraints and the relevant piece of code:
Constraints:
theta(0) + theta(1) < 1
theta(0), theta(1), theta(2), theta(3) > 0
% Solve MLE using fmincon
ret_1000 = returns(1:1000);
A = [1 1 0 0];
b = [.99999];
lb = [0; 0; 0; 0];
ub = [1; 1; 1; 1];
Aeq = [];
beq = [];
noncoln = [];
init_guess = [.2;.5; long_term_sigma; initial_sigma];
%option = optimset('FunValCheck', 1000);
options = optimset('fmincon');
options = optimset(options, 'MaxFunEvals', 10000);
[x, maxim] = fmincon(@(theta)Log_likeli(theta, ret_1000), init_guess, A, b, Aeq, beq, lb, ub, noncoln, options);
Warning:
Warning: The default trust-region-reflective algorithm does not solve problems with the constraints you
have specified. FMINCON will use the active-set algorithm instead. For information on applicable
algorithms, see Choosing the Algorithm in the documentation.
> In fmincon at 486
In GARCH_loglikeli at 30
Local minimum possible. Constraints satisfied.
fmincon stopped because the predicted change in the objective function
is less than the selected value of the function tolerance and constraints
are satisfied to within the selected value of the constraint tolerance.
<stopping criteria details>
No active inequalities.
All MATLAB variables are double by default. You can force a double using double(variableName), and you can get the type of a variable using class(variableName). I would call class on all your variables to make sure they are what you expect. I don't have fmincon, but I tried a variant of your code with fminsearch and it worked like a charm:
op = optimset('fminsearch');
op = optimset(op,'MaxFunEvals',1000,'MaxIter',1000);
a = sqrt(2);
banana = @(x)100*(x(2)-x(1)^2)^2+(a-x(1))^2;
[x,fval] = fminsearch(banana, [-1.2, 1],op)
Looking at the matlab documentation, I think your input variables are not correct:
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)
I think you need:
% Let's be ultra specific to solve this syntax issue
fun = @(theta) Log_likeli(theta, ret_1000);
x0 = init_guess;
% A is defined as A
% b is defined as b
Aeq = [];
beq = [];
% lb is defined as lb
% ub is not defined, not sure if that's going to be an issue
% with the solver having lower, but not upper bounds probably isn't
% but thought it was worth a mention
ub = [];
nonlcon = [];
% options is defined as options
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)
So let's say I have T, with T = 1200. I also have A, an array that contains thousands of numerical entries ranging from 1000 to 2000, but with no entry for 1200.
What's the fastest way of finding the nearest neighbour (closest value)? Let's say ties are rounded up, so it should match 1201, not 1199, in A.
Note: this will be run on ENTER_FRAME.
Also note: A is static.
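It is also very fast to use Vector.<int> instead of Array and do a simple for-loop:
// The vector must be sorted in ascending order for these loops to work.
var vector:Vector.<int> = new <int>[ 0,1,2, /*....*/ 2000];
function seekNextLower( searchNumber:int ) : int {
    for (var i:int = vector.length-1; i >= 0; i--) {
        if (vector[i] <= searchNumber) return vector[i];
    }
    return -1; // sentinel (assumes non-negative values): no entry is low enough
}
function seekNextHigher( searchNumber:int ) : int {
    for (var i:int = 0; i < vector.length; i++) {
        if (vector[i] >= searchNumber) return vector[i];
    }
    return -1; // sentinel (assumes non-negative values): no entry is high enough
}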
Using any array methods will be more costly than iterating over Vector.<int> - it was optimized for exactly this kind of operation.
If you're looking to run this on every ENTER_FRAME event, you'll probably benefit from some extra optimization.
If you keep track of the entries when they are written to the array, you don't have to sort them.
For example, you'd have an array where T is the index, and each entry would hold an object with an array of all the indexes in A that hold that value. You could also store the closest value's index as part of that object, so when you're retrieving this every frame, you only need to access that value rather than search.
Of course this would only help if you read a lot more than you write, because recreating the object is quite expensive, so it really depends on use.
You might also want to look into linked lists; for certain operations they are quite a bit faster (slower to sort, though).
You have to read each value, so the complexity will be linear. It's pretty much like finding the smallest int in an array.
var closestIndex:uint;
var closestDistance:uint = uint.MAX_VALUE;
var currentDistance:uint;
var arrayLength:uint = A.length;
for (var index:int = 0; index<arrayLength; index++)
{
    currentDistance = Math.abs(T - A[index]);
    if (currentDistance < closestDistance ||
        (currentDistance == closestDistance && A[index] > T)) //between two values with the same distance, prefers the one larger than T
    {
        closestDistance = currentDistance;
        closestIndex = index;
    }
}
return A[closestIndex]; // the closest value lives in A, not in T
Since your array is sorted you could adapt a straightforward binary search (such as explained in this answer) to find the 'pivot' where the left-subdivision and the right-subdivision at a recursive step bracket the value you are 'searching' for.
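To make the binary-search idea concrete, here is a short Python sketch using the standard `bisect` module. It assumes the array has been sorted once up front (reasonable, since A is static) and breaks ties in favour of the higher value, as the question asks. The function name `nearest` and the sample values are illustrative.

```python
from bisect import bisect_left

def nearest(sorted_values, target):
    """Closest value to target in an ascending list; ties go to the higher value."""
    if not sorted_values:
        raise ValueError("sorted_values must not be empty")
    i = bisect_left(sorted_values, target)   # first index with value >= target
    if i == 0:
        return sorted_values[0]              # target is below every entry
    if i == len(sorted_values):
        return sorted_values[-1]             # target is above every entry
    below, above = sorted_values[i - 1], sorted_values[i]
    return above if above - target <= target - below else below

A = sorted([2000, 1201, 1000, 1199])         # sort once, outside the frame loop
print(nearest(A, 1200))
```

Each lookup then costs O(log n) comparisons instead of a full scan, which matters when it runs every frame.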
Just a thought I had... Sort A (since it's static you can sort it once before you start), and then estimate which index to start at (say A has length 100 and you want 1200: 100*(200/1000) = 20). Start at that guess; if A[guess] is higher than 1200, check the value at A[guess-1]. If it is still higher, keep going down until you have one value that is higher and a neighbouring one that is lower. Once you have found them, determine which is closer. If your initial guess was too low, keep going up instead.
This won't be great and might not be the best performance-wise, but it would be a lot better than checking every single value, and it will work quite well if A is evenly spaced between 1000 and 2000.
Good luck!
public function nearestNumber(value:Number,list:Array):Number{
    var currentNumber:Number = list[0];
    for (var i:int = 0; i < list.length; i++) {
        if (Math.abs(value - list[i]) < Math.abs(value - currentNumber)){
            currentNumber = list[i];
        }
    }
    return currentNumber;
}