Function that will not return 0

I am writing a formula to use as a decay multiplier on a given value.
The problem is the following: I have a processing window of, let's say, 10 days, and this window is recomputed anew every day. I need to decay a certain parameter by a factor reflecting the number of days an id has been present. Currently what I do is:
(previousWinSize - (start of the current window - start of the previous window)) / previousWinSize
In this case, if my previous window size is 10 and the difference in the days of processing is two, I get (10 - 2)/10 = 0.8 to multiply my variable by, which decays 0.2 of it.
However, if I have a 3-day window and again 2 days of difference, (3 - 2)/3 gives a value close to 0, which cuts more than I would like it to.
Thank you in advance.

I recommend making use of a sigmoid function, e.g. y = 1/(1 + e^(-a(x - b))).
You can take the output of your function, i.e. a number between 0 and 1 based on the difference in days of processing, and feed it into the sigmoid. If you set the a (slope) and b (inflection point) parameters properly, you can, for example, ensure that the lowest decay multiplier you get is ~0.5 when your original equation returns a number close to 0.
I've graphed the example I stated above here:
https://www.desmos.com/calculator/nqemuexjhg
(This is based on: https://www.desmos.com/calculator/rna4aqta0c)
I think you do have two edge cases with this method, though. When your equation returns 0, the sigmoid isn't going to give you exactly 0.5 (which you may not even want to begin with); you'll end up with something close to 0.5. In that scenario your values may start to drift if you keep applying the sigmoid. The same is true when your equation returns 1: after putting it through the sigmoid you won't get 1, you'll get something close to 1.
What I think I'd do in such a scenario is have some sort of check before the sigmoid gets applied, e.g.:
if (x == 0)
    y = 0;
else if (x == 1)
    y = 1;
else
    y = sigmoid(x);
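For illustration, here is a minimal Python sketch of the whole pipeline. The values a = 5 and b = 0 are assumptions chosen so the sigmoid's output stays above ~0.5 on (0, 1); they are not the exact parameters from the Desmos graphs.

import math

def sigmoid(x, a=5.0, b=0.0):
    # logistic curve; a is the slope, b the inflection point (illustrative values)
    return 1.0 / (1.0 + math.exp(-a * (x - b)))

def decay_multiplier(prev_win_size, day_diff):
    x = (prev_win_size - day_diff) / prev_win_size  # the original formula
    # edge-case checks from above, so repeated application cannot drift
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    return sigmoid(x)

print(decay_multiplier(10, 2))  # ~0.98 (the raw formula gave 0.8)
print(decay_multiplier(3, 2))   # ~0.84 (the raw formula gave ~0.33)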
Sources / Possible further reading:
https://en.wikipedia.org/wiki/Sigmoid_function

Octave FWHM calculation

I am having a problem calculating the FWHM of my data: the "fwhm" function in the signal package returns a value 100 times bigger than I expected.
What I did: based on the Gaussian distribution function (you can find it on Wikipedia), I produced some data. In this function you give a specific sigma (RMS) value (FWHM = sigma * 2.355). Here is the script I wrote to understand the situation:
x=10:0.01:40;
x0=25;
sigma=0.25;
y=(1/(sigma*sqrt(2*pi)))*exp(-((x-x0).^2)/(2*sigma^2));
z=fwhm(y)/2.355;
plot(x,y)
When I compared the results, the output of the "fwhm" function (24.999) is 100 times bigger than the sigma (0.25) I used in the function.
If you have any idea, it would be very helpful.
Thanks in advance.
Your z is 100 times bigger because your steps in x are 1/100 (0.01). fwhm(y) assumes a step size of 1 in x; if that is not the case, you have to pass x explicitly.
In your case you should do:
z=fwhm(x, y)/2.355
z = 0.24999
which matches your sigma.
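To see the unit mix-up concretely, here is a rough NumPy sketch of the same measurement in Python. This is not the signal package's implementation (fwhm additionally interpolates between samples, which is why it reports 24.999 rather than the integer count below); it only shows that a half-maximum width counted in samples must be scaled by the x spacing.

import numpy as np

x = np.arange(10, 40, 0.01)
x0, sigma = 25.0, 0.25
y = np.exp(-((x - x0) ** 2) / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# half-maximum width counted in samples: effectively what fwhm(y) measures
above = np.nonzero(y >= y.max() / 2)[0]
width_samples = above[-1] - above[0]

print(width_samples / 2.355)         # ~24.6: off by a factor of 1/dx = 100
print(width_samples * 0.01 / 2.355)  # ~0.25: scaled by the x spacing, matches sigma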

Temperature Scale in SA

First, this is not a question about temperature iteration counts or automatically optimized scheduling. It's about how the magnitude of the data relates to the scaling of the exponentiation.
I'm using the classic formula:
if(delta < 0 || exp(-delta/tK) > random()) { // new state }
The input to the exp function is negative because delta/tK is positive, so the exp result is always less than 1. The random function also returns a value in the 0 to 1 range.
My test data is in the range 1 to 20, and the delta values are below 20. I pick a start temperature equal to the initial computed temperature of the system and linearly ramp down to 1.
In order to get SA to work, I have to scale tK. The working version uses:
exp(-delta/(tK * .001)) > random()
So how does the magnitude of tK relate to the magnitude of delta? I found the scaling factor by trial and error, and I don't understand why it's needed. To my understanding, as long as delta > tK and the step size and number of iterations are reasonable, it should work. In my test case, if I leave out the extra scale the temperature of the system does not decrease.
The various online sources I've looked at say nothing about working with real data. Sometimes they include the Boltzmann constant as a scale, but since I'm not simulating a physical particle system that doesn't help. Examples (typically with pseudocode) use values like 100 or 1000000.
So what am I missing? Is scaling another value that I must set by trial and error? It's bugging me because I don't just want to get this test case running, I want to understand the algorithm, and magic constants mean I don't know what's going on.
Classical SA has 2 parameters: startingTemperature and cooldownSchedule (= what you call scaling).
Configuring 2+ parameters is annoying, so in OptaPlanner's implementation I automatically calculate the cooldownSchedule based on the timeGradient (a double going from 0.0 to 1.0 over the solver time). This works well. As a guideline for the startingTemperature, I use the maximum score diff of a single move. For more information, see the docs.
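To make the delta/temperature relationship concrete, here is a small Python sketch. The delta = 20 figure is taken from the question; the temperatures and the helper name are illustrative.

import math

# the acceptance probability for an uphill move depends only on the
# ratio delta/tK, so the temperature scale must match the delta scale
for tK in (1.0, 5.0, 30.0):
    p = math.exp(-20 / tK)  # delta = 20, the worst case in the question
    print(tK, p)            # ~2e-9, ~0.018, ~0.51

# inverting p = exp(-delta/T) gives the temperature at which an uphill
# move of size delta is accepted with probability p
def temperature_for(delta, p):
    return -delta / math.log(p)

print(temperature_for(20, 0.5))  # ~28.85

So if your effective temperature is orders of magnitude away from that, the acceptance test is either almost always true or almost never true, which is where a trial-and-error scale factor like 0.001 comes from.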

What does "step" mean in stepSimulation and what do its parameters mean in Bullet Physics?

What does the term "step" mean in Bullet Physics?
What do the function stepSimulation() and its parameters mean?
I have read the documentation but could not make much of it.
Any valid explanation would be of great help.
I know I'm late, but I thought the accepted answer was only marginally better than the documentation's description.
timeStep: The amount of seconds, not milliseconds, passed since the last call to stepSimulation.
maxSubSteps: Should generally stay at one so Bullet interpolates current values on its own. A value of zero implies a variable tick rate, meaning Bullet advances the simulation exactly timeStep seconds instead of interpolating. This feature is buggy and not recommended. A value greater than one must always satisfy the equation timeStep < maxSubSteps * fixedTimeStep or you're losing time in the simulation.
fixedTimeStep: Inversely proportional to the simulation's resolution. Resolution increases as this value decreases. Keep in mind that a higher resolution means it takes more CPU.
btDynamicsWorld::stepSimulation(
    btScalar timeStep,
    int maxSubSteps = 1,
    btScalar fixedTimeStep = btScalar(1.) / btScalar(60.));
timeStep - time passed since the last simulation step.
Internally, the simulation advances in constant increments of fixedTimeStep:
fixedTimeStep ~ 0.01666666 = 1/60
If timeStep is 0.1, it will run 6 (timeStep / fixedTimeStep) internal simulation steps.
To make movements smooth, Bullet interpolates the final step results according to the remainder of the division (timeStep / fixedTimeStep).
timeStep - the amount of time in seconds to step the simulation by. Typically you're going to be passing it the time since you last called it.
maxSubSteps - the maximum number of steps that Bullet is allowed to take each time you call it.
fixedTimeStep - regulates the resolution of the simulation. If your balls penetrate your walls instead of colliding with them, try decreasing it.
Here I would like to address the claim in Proxy's answer about the special meaning of the value 1 for maxSubSteps. There is only one special value, 0, and you most likely don't want to use it, because then the simulation will run with a non-constant time step. All other values behave the same. Let's have a look at the actual code:
if (maxSubSteps)
{
    m_localTime += timeStep;
    ...
    if (m_localTime >= fixedTimeStep)
    {
        numSimulationSubSteps = int(m_localTime / fixedTimeStep);
        m_localTime -= numSimulationSubSteps * fixedTimeStep;
    }
}
...
if (numSimulationSubSteps)
{
    //clamp the number of substeps, to prevent simulation grinding spiralling down to a halt
    int clampedSimulationSteps = (numSimulationSubSteps > maxSubSteps) ? maxSubSteps : numSimulationSubSteps;
    ...
    for (int i = 0; i < clampedSimulationSteps; i++)
    {
        internalSingleStepSimulation(fixedTimeStep);
        synchronizeMotionStates();
    }
}
So, there is nothing special about maxSubSteps equal to 1. You should really abide by the formula timeStep < maxSubSteps * fixedTimeStep if you don't want to lose simulation time.
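If you want to experiment with that bookkeeping outside of Bullet, here is a small Python re-sketch of the accumulator logic quoted above (not Bullet's API, just the same arithmetic):

FIXED_TIME_STEP = 1.0 / 60.0

def step(local_time, time_step, max_sub_steps, fixed=FIXED_TIME_STEP):
    # same bookkeeping as the C++ snippet: bank the elapsed time,
    # take as many fixed-size substeps as fit, then clamp
    local_time += time_step
    n = int(local_time / fixed)
    local_time -= n * fixed
    clamped = min(n, max_sub_steps)
    lost = (n - clamped) * fixed  # simulation time silently dropped by the clamp
    return clamped, local_time, lost

# timeStep = 0.1 with maxSubSteps = 1 violates timeStep < maxSubSteps * fixedTimeStep:
print(step(0.0, 0.1, 1))  # (1, ~0.0, ~0.083): five substeps' worth of time is lost
print(step(0.0, 0.1, 7))  # (6, ~0.0, 0.0): nothing lost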

Function to dampen a value

I have a list of documents, each with a relevance score for a search query. I need older documents to have their relevance score dampened, to introduce their date into the ranking process. I already tried fiddling with functions such as 1/(1 + date_difference), but the reciprocal function is too discriminating between close, recent dates.
I was thinking of a mathematical function with range (0..1) and domain (0..x) to scale their score, where the x-axis is the age of a document. It's best to explain what I further need from the function with an image: a curve that stays near 1 for recent documents and decays toward 0 as the age approaches a threshold x.
Decaying behavior is often modeled well by an exponential function (many decaying processes in nature follow one). You would use two positive parameters A and B and get
y(x) = A * exp(-B * x)
Since you want a y-range of [0, 1], set A = 1. Smaller B gives slower decay.
If a simple 1/(1+x) decreases too quickly too soon, a sigmoid function like 1/(1+e^-x) or the error function might be better suited to your purpose. Let the current date be somewhere in the negative numbers for such a function, and you can get a value that is current for some configurable time and then decreases towards a base value.
log((x+1) - age_of_document)
where the base of the logarithm is (x+1). Note that x is as per your diagram and is the "threshold": if the age of the document is greater than x, the score goes negative. Multiply by the maximum possible score to introduce scaling.
E.g. domain = (0, 10) with a maximum score of 10: 10 * log(11 - age) / log(11)
A bit late, but as thiton says, you might want to use a sigmoid function instead, since it has a "floor" value for your long-tail data points. E.g.:
0.8/(1 + 5^(x-3)) + 0.2. You can adjust the constants 5 and 3 to control the slope of the curve; the 0.2 is where the floor will be.
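The three suggestions are easy to compare side by side. A hedged Python sketch: B = 0.2 is an assumed value, the threshold and maximum score of 10 follow the log example above, and the sigmoid constants are taken verbatim from the last answer.

import math

def exp_decay(age, B=0.2):
    # the exponential answer's y(x) = A*exp(-B*x) with A = 1; B = 0.2 assumed
    return math.exp(-B * age)

def sigmoid_floor(age):
    # 0.8/(1 + 5^(x-3)) + 0.2, with a floor of 0.2 for very old documents
    return 0.8 / (1 + 5 ** (age - 3)) + 0.2

def log_decay(age, threshold=10, max_score=10):
    # 10*log(11 - age)/log(11): hits 0 at the threshold, negative beyond it
    return max_score * math.log(threshold + 1 - age) / math.log(threshold + 1)

for age in (0, 1, 3, 5, 10):
    print(age, round(exp_decay(age), 3), round(sigmoid_floor(age), 3), round(log_decay(age), 3))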

Determining the input of a function given an output (Calculus involved)

My Calculus teacher gave us a program to calculate the definite integral of a function over a given interval using the trapezoidal rule. I know that programmed functions take an input and produce an output, as arithmetic functions do, but I don't know how to do the inverse: find the input given the output.
The problem states:
"Use the trapezoidal rule with varying numbers, n, of increments to estimate the distance traveled from t=0 to t=9. Find a number D for which the trapezoidal sum is within 0.01 unit of this limit (468) when n > D."
I've estimated the limit through "plug and chug" with the calculator and I know that with a regular algebraic function, I could easily do:
limit (468) = algebraic expression with variable x
(then solve for x)
However, how would I do this for a programmed function? How would I determine the input of a programmed function, given its output?
I am calculating the definite integral of the polynomial, (x^2+11x+28)/(x+4), over the interval 0 to 9. The trapezoidal rule function in my calculator computes the definite integral over that interval using a given number of trapezoids, n.
Overall, I want to know how to do this:
Solve for n:
468 = trapezoidal_rule(a = 0, b = 9, n);
The code for trapezoidal_rule(a, b, n) on my TI-83:
Prompt A
Prompt B
Prompt N
(B-A)/N->D
0->S
A->X
Y1/2->S
For(K,1,N-1,1)
X+D->X
Y1+S->S
End
B->X
Y1/2+S->S
SD->I
Disp "INTEGRAL"
Disp I
Because I'm not familiar with this syntax, nor with computer algorithms, I was hoping someone could help me turn this code into an algebraic equation, or point me in the direction of doing so.
Edit: this is not part of my homework, just intellectual curiosity.
the polynomial, (x^2+11x+28)/(x+4)
This is equal to x+7. The trapezoidal rule should give exactly correct results for this function! I'm guessing that this isn't actually the function you're working with...
There is no general way to determine, given the output of a function, what its input was. (For one thing, many functions can map multiple different inputs to the same output.)
So, there is a formula for the error when you apply the trapezoidal rule with a given number of steps to a given function, and you could use that here to work out the value of n you need ... but (1) it's not terribly beautiful, and (2) it doesn't seem like a very reasonable thing to expect you to do when you're just starting to look at the trapezoidal rule. I'd guess that your teacher actually just wanted you to "plug and chug".
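That said, the program itself is easy to follow once transcribed. Here is a line-by-line Python translation of the TI-83 code, assuming Y1 is the integrand stored in the calculator's Y= editor:

def trapezoidal_rule(a, b, n, y1):
    d = (b - a) / n            # (B-A)/N -> D : width of each trapezoid
    x = a                      # A -> X
    s = y1(x) / 2              # Y1/2 -> S : half weight at the left endpoint
    for _ in range(n - 1):     # For(K,1,N-1,1)
        x += d                 # X+D -> X
        s += y1(x)             # Y1+S -> S
    s += y1(b) / 2             # B -> X, then Y1/2+S -> S : right endpoint
    return s * d               # SD -> I : implicit multiplication, S times D

print(trapezoidal_rule(0, 9, 100, lambda x: (x**2 + 11*x + 28) / (x + 4)))
# prints 103.5: already exact here, since the integrand equals x + 7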
I don't know (see above) what function you're actually integrating, but let's pretend it's just x^2+11x+28. I'll call this f(x) below. The integral of this from 0 to 9 is actually 940.5. Suppose you divide the interval [0,9] into n pieces. Then the trapezoidal rule gives you: [f(0)/2 + f(1*9/n) + f(2*9/n) + ... + f((n-1)*9/n) + f(9)/2] * 9/n.
Let's separate this out into the contributions from x^2, from 11x, and from 28. It turns out that the trapezoidal approximation gives exactly the right result for the latter two. (Exercise: Work out why.) So the error you get from the trapezoidal rule is exactly the same as the error you'd have got from f(x) = x^2.
The actual integral of x^2 from 0 to 9 is (9^3-0^3)/3 = 243. The trapezoidal approximation is [0/2 + 1^2+2^2+...+(n-1)^2 + n^2/2] * (9/n)^2 * (9/n). (Exercise: work out why.) There's a standard formula for sums of consecutive squares: 1^2 + ... + m^2 = m(m+1/2)(m+1)/3. So our trapezoidal approximation to the integral of x^2 is (9/n)^3 times [(n-1)(n-1/2)n/3 + n^2/2] = (9/n)^3 times [n^3/3 + n/6] = 243 + (9/n)^3 * n/6.
In other words, the error in this case is exactly (9/n)^3 * n/6 = (243/2) / n^2.
So, for instance, the error will be less than 0.01 when (243/2) / n^2 < 0.01, which is the same as n^2 > 100*243/2 = 12150, which is true when n >= 111.
[EDITED to add: I haven't checked any of my algebra or arithmetic carefully; there may be small errors. I take it what you're interested is the ideas rather than the specific numbers.]
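A quick numeric check of that error formula in Python, using the pretend integrand f(x) = x^2 + 11x + 28 from above (exact integral 940.5):

def trap(f, a, b, n):
    h = (b - a) / n
    return h * ((f(a) + f(b)) / 2 + sum(f(a + k * h) for k in range(1, n)))

f = lambda x: x**2 + 11*x + 28
for n in (10, 100, 110, 111):
    print(n, trap(f, 0, 9, n) - 940.5, 121.5 / n**2)  # measured vs. (243/2)/n^2

# smallest n with error under 0.01
n = 1
while trap(f, 0, 9, n) - 940.5 >= 0.01:
    n += 1
print(n)  # 111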