How to find a function that fits a given set of data points in Julia? - function

So, I have a vector that corresponds to a given feature (same dimensionality). Is there a package in Julia that would provide a mathematical function that fits these data points, in relation to the original feature? In other words, I have x and y (both vectors) and need to find a decent mapping between the two, even if it's a highly complex one. The output of this process should be a symbolic formula that connects x and y, e.g. (:x)^3 + log(:x) - 4.2454. It's fine if it's just a polynomial approximation.
I imagine this is a walk in the park if you employ Genetic Programming, but I'd rather opt for a simpler (and faster) approach, if it's available. Thanks

Turns out the Polynomials.jl package includes the function polyfit which does Lagrange interpolation. A usage example would go:
using Polynomials # install with Pkg.add("Polynomials")
x = [1,2,3] # demo x
y = [10,12,4] # demo y
polyfit(x,y)
The last line returns:
Poly(-2.0 + 17.0x - 5.0x^2)`
which evaluates to the correct values.
The polyfit function accepts a maximal degree for the output polynomial, but defaults to using the length of the input vectors x and y minus 1. This is the same degree as the polynomial from the Lagrange formula, and since polynomials of such degree agree on the inputs only if they are identical (this is a basic theorem) - it can be certain this is the same Lagrange polynomial and in fact the only one of such a degree to have this property.
Thanks to the developers of Polynomial.jl for leaving me just to google my way to an Answer.

Take a look to MARS regression. Multi adaptive regression splines.

Related

Finding roots of a polynomial without writing it in the form of a matrix

Is there any method to find the root of a polynomial, not in matrix form, in MATLAB?
I know, to find roots of a polynomial (say, p(x) = x^.2 - 4), I should do the following:
p = [1 0 -4];
r = roots(p)
What I wanted to know if there is some way to find the root of a function (say p(x) = x^.2 - 4) already present in polynomial form (not in matrix form) in my matlab code? Like anything similar to r = roots(p(x)) (this doesn't work, of course).
Root is good
First of all the solution using roots is probably the one that will give you the most accurate and fastest results if you are indeed working with polynomials. I will acknowledge that it might be an issue if your function is not a polynomial.
Finding the roots of a function
If you don't want to use roots that means you will probably have to represent your polynomial as an anonymous function. Then you can use any root-finding algorithm on that function. Wikipedia has a few of them listed. What is tricky is that in general they don't guarantee that they will find one root, let alone all of them. So you might need as much prior information on your function as you can.
In matlab you can use fzero. The issue with it is that it only finds one zero and that it will only find zeros where the function changes sign (it wouldn't work on p(x) = x² for example). This is how you would implement it:
p = #(x) x.^2 - 4; % Define your polynomial as an anonymous function
x0 = 12; % Initial guess for the zero
% Find a root
fzero(p, x0)
>>> ans = 2
% Now with a different initial guess for a different solution
x0 = -12;
fzero(p, x0)
>>> ans = -2
As you can see this works only if you want to find a root and don't care which one it is.
Problem
The issue is that you polynomials with integer or rational coefficients have a way of finding the roots by using square-free factorization. Yet you can only apply that if you have some way of storing and accessing those coefficients in matlab. The anonymous functions don't allow you to do that. That's why roots works with a matrix and not an anonymous function.

Backpropagation on Two Layered Networks

i have been following cs231n lectures of Stanford and trying to complete assignments on my own and sharing these solutions both on github and my blog. But i'm having a hard time on understanding how to modelize backpropagation. I mean i can code modular forward and backward passes but what bothers me is that if i have the model below : Two Layered Neural Network
Lets assume that our loss function here is a softmax loss function. In my modular softmax_loss() function i am calculating loss and gradient with respect to scores (dSoft = dL/dY). After that, when i'am following backwards lets say for b2, db2 would be equal to dSoft*1 or dW2 would be equal to dSoft*dX2(outputs of relu gate). What's the chain rule here ? Why isnt dSoft equal to 1 ? Because dL/dL would be 1 ?
The softmax function is outputs a number given an input x.
What dSoft means is that you're computing the derivative of the function softmax(x) with respect to the input x. Then to calculate the derivative with respect to W of the last layer you use the chain rule i.e. dL/dW = dsoftmax/dx * dx/dW. Note that x = W*x_prev + b where x_prev is the input to the last node. Therefore dx/dW is just x and dx/db is just 1, which means that dL/dW or simply dW is dsoftmax/dx * x_prev and dL/db or simply db is dsoftmax/dx * 1. Note that here dsoftmax/dx is dSoft we defined earlier.

How to Solve non-specific non-linear equations?

I am attempting to fit a circle to some data. This requires numerically solving a set of three non-linear simultaneous equations (see the Full Least Squares Method of this document).
To me it seems that the NEWTON function provided by IDL is fit for solving this problem. NEWTON requires the name of a function that will compute the values of the equation system for particular values of the independent variables:
FUNCTION newtfunction,X
RETURN, [Some function of X, Some other function of X]
END
While this works fine, it requires that all parameters of the equation system (in this case the set of data points) is hard coded in the newtfunction. This is fine if there is only one data set to solve for, however I have many thousands of data sets, and defining a new function for each by hand is not an option.
Is there a way around this? Is it possible to define functions programmatically in IDL, or even just pass in the data set in some other manner?
I am not an expert on this matter, but if I were to solve this problem I would do the following. Instead of solving a system of 3 non-linear equations to find the three unknowns (i.e. xc, yc and r), I would use an optimization routine to converge to a solution by starting with an initial guess. For this steepest descent, conjugate gradient, or any other multivariate optimization method can be used.
I just quickly derived the least square equation for your problem as (please check before use):
F = (sum_{i=1}^{N} (xc^2 - 2 xi xc + xi^2 + yc^2 - 2 yi yc + yi^2 - r^2)^2)
Calculating the gradient for this function is fairly easy, since it is just a summation, and therefore writing a steepest descent code would be trivial, to calculate xc, yc and r.
I hope it helps.
It's usual to use a COMMON block in these types of functions to pass in other parameters, cached values, etc. that are not part of the calling signature of the numeric routine.

Numerical integration of a discontinuous function in multiple dimensions

I have a function f(x) = 1/(x + a+ b*I*sign(x)) and I want to calculate the
integral of
dx dy dz f(x) f(y) f(z) f(x+y+z) f(x-y - z)
over the entire R^3 (b>0 and a,- b are of order unity). This is just a representative example -- in practice I have n<7 variables and 2n-1 instances of f(), n of them involving the n integration variables and n-1 of them involving some linear combintation of the integration variables. At this stage I'm only interested in a rough estimate with relative error of 1e-3 or so.
I have tried the following libraries :
Steven Johnson's cubature code: the hcubature algorithm works but is abysmally slow, taking hundreds of millions of integrand evaluations for even n=2.
HintLib: I tried adaptive integration with a Genz-Malik rule, the cubature routines, VEGAS and MISER with the Mersenne twister RNG. For n=3 only the first seems to be somewhat viable option but it again takes hundreds of millions of integrand evaluations for n=3 and relerr = 1e-2, which is not encouraging.
For the region of integration I have tried both approaches: Integrating over [-200, 200]^n (i.e. a region so large that it essentially captures most of the integral) and the substitution x = sinh(t) which seems to be a standard trick.
I do not have much experience with numerical analysis but presumably the difficulty lies in the discontinuities from the sign() term. For n=2 and f(x)f(y)f(x-y) there are discontinuities along x=0, y=0, x=y. These create a very sharp peak around the origin (with a different sign in the various quadrants) and sort of 'ridges' at x=0,y=0,x=y along which the integrand is large in absolute value and changes sign as you cross them. So at least I know which regions are important. I was thinking that maybe I could do Monte Carlo but somehow "tell" the algorithm in advance where to focus. But I'm not quite sure how to do that.
I would be very grateful if you had any advice on how to evaluate the integral with a reasonable amount of computing power or how to make my Monte Carlo "idea" work. I've been stuck on this for a while so any input would be welcome. Thanks in advance.
One thing you can do is to use a guiding function for your Monte Carlo integration: given an integral (am writing it in 1D for simplicity) of ∫ f(x) dx, write it as ∫ f(x)/g(x) g(x) dx, and use g(x) as a distribution from which you sample x.
Since g(x) is arbitrary, construct it such that (1) it has peaks where you expect them to be in f(x), and (2) such that you can sample x from g(x) (e.g., a gaussian, or 1/(1+x^2)).
Alternatively, you can use a Metropolis-type Markov chain MC. It will find the relevant regions of the integrand (almost) by itself.
Here are a couple of trivial examples.

"Reverse" statistics: generating data based on mean and standard deviation

Having a dataset and calculating statistics from it is easy. How about the other way around?
Let's say I know some variable has an average X, standard deviation Y and assume it has normal (Gaussian) distribution. What would be the best way to generate a "random" dataset (of arbitrary size) which will fit the distribution?
EDIT: This kind of develops from this question; I could make something based on that method, but I am wondering if there's a more efficient way to do it.
You can generate standard normal random variables with the Box-Mueller method. Then to transform that to have mean mu and standard deviation sigma, multiply your samples by sigma and add mu. I.e. for each z from the standard normal, return mu + sigma*z.
This is really easy to do in Excel with the norminv() function. Example:
=norminv(rand(), 100, 15)
would generate a value from a normal distribution with mean of 100 and stdev of 15 (human IQs). Drag this formula down a column and you have as many values as you want.
I found a page where this problem is solved in several programming languages:
http://rosettacode.org/wiki/Random_numbers
There are several methods to generate Gaussian random variables. The standard method is Box-Meuller which was mentioned earlier. A slightly faster version is here:
http://en.wikipedia.org/wiki/Ziggurat_algorithm
Here's the wikipedia reference on generating Gaussian variables
http://en.wikipedia.org/wiki/Normal_distribution#Generating_values_from_normal_distribution
I'll give an example using R and the 2nd algorithm in the list here.
X<-4; Y<-2 # mean and std
z <- sapply(rep(0,100000), function(x) (sum(runif(12)) - 6) * Y + X)
plot(density(z))
> mean(z)
[1] 4.002347
> sd(z)
[1] 2.005114
> library(fUtilities)
> skewness(z,method ="moment")
[1] -0.003924771
attr(,"method")
[1] "moment"
> kurtosis(z,method ="moment")
[1] 2.882696
attr(,"method")
[1] "moment"
You could make it a kind of Monte Carlo simulation. Start with a wide random "acceptable range" and generate a few truly random values. Check your statistics and see if the average and variance are off. Adjust the "acceptable range" for the random values and add a few more values. Repeat until you have hit both your requirements and your population sample size.
Just off the top of my head, let me know what you think. :-)
The MATLAB function normrnd from the Statistics Toolbox can generate normally distributed random numbers with a given mu and sigma.
It is easy to generate dataset with normal distribution (see http://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform ).
Remember that generated sample will not have exact N(0,1) distribution! You need to standarize it - substract mean and then divide by std deviation. Then You are free to transform this sample to Normal distribution with given parameters: multiply by std deviation and then add mean.
Interestingly numpy has a prebuilt function for that:
import numpy as np
def generate_dataset(mean, std, samples):
dataset = np.random.normal(mean, std, samples)
return dataset