Third partial derivative approximations on uniform grid - numerical-methods

I have a uniform grid and need to compute third partial derivative approximations at its nodes.
I have found approximations only for second-order derivatives.
Could someone point me to, or explain, a way to build formulas for third-order partial derivatives?
Particularly, I have to calculate fxxx(x,y), fxxy(x,y), fyyy(x,y) and fyyx(x,y).
Many thanks.

Let's say that f[i,j] is the value at node (i,j), and h is the grid spacing (assumed equal in x and y). You already know how to calculate second-order derivatives of f, for example
fxx[i,j] = (f[i+1,j]-2*f[i,j]+f[i-1,j])/h^2
fyy[i,j] = (f[i,j+1]-2*f[i,j]+f[i,j-1])/h^2
fxy[i,j] = (f[i+1,j+1]-f[i+1,j-1]-f[i-1,j+1]+f[i-1,j-1])/(4*h^2)
These are second-order accurate, that is, the error is proportional to h^2. To maintain this accuracy, take one more derivative using the symmetric (central) difference rule, which for any grid function g reads
gx[i,j] = (g[i+1,j]-g[i-1,j])/(2*h)
This results in:
fxxx[i,j] = ((f[i+2,j]-2*f[i+1,j]+f[i,j])-(f[i,j]-2*f[i-1,j]+f[i-2,j]))/(2*h^3)
(which simplifies to (f[i+2,j]-2*f[i+1,j]+2*f[i-1,j]-f[i-2,j])/(2*h^3)), and similarly for the other derivatives:
fxxy[i,j] = ((f[i+1,j+1]-2*f[i,j+1]+f[i-1,j+1])-(f[i+1,j-1]-2*f[i,j-1]+f[i-1,j-1]))/(2*h^3)
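For what it's worth, here is a minimal Python sketch of all four stencils, assuming the grid is stored as a nested list `f[i][j]` with `i` indexing x and `j` indexing y, and the same spacing `h` in both directions. The function names and the test polynomial are my own choices for illustration; `fyyy` and `fyyx` are just the x/y mirror images of the two formulas above.

```python
def fxxx(f, i, j, h):
    # central difference of fxx in x, simplified
    return (f[i+2][j] - 2*f[i+1][j] + 2*f[i-1][j] - f[i-2][j]) / (2*h**3)

def fyyy(f, i, j, h):
    # same stencil, applied along y
    return (f[i][j+2] - 2*f[i][j+1] + 2*f[i][j-1] - f[i][j-2]) / (2*h**3)

def fxxy(f, i, j, h):
    # second difference in x on rows j+1 and j-1, then central difference in y
    upper = f[i+1][j+1] - 2*f[i][j+1] + f[i-1][j+1]
    lower = f[i+1][j-1] - 2*f[i][j-1] + f[i-1][j-1]
    return (upper - lower) / (2*h**3)

def fyyx(f, i, j, h):
    # second difference in y on columns i+1 and i-1, then central difference in x
    right = f[i+1][j+1] - 2*f[i+1][j] + f[i+1][j-1]
    left = f[i-1][j+1] - 2*f[i-1][j] + f[i-1][j-1]
    return (right - left) / (2*h**3)

def poly(x, y):
    # test function with known derivatives: fxxx=6, fxxy=4, fyyx=6, fyyy=24
    return x**3 + 2*x**2*y + 3*x*y**2 + 4*y**3

h = 0.1
grid = [[poly(i*h, j*h) for j in range(5)] for i in range(5)]
i, j = 2, 2  # interior node: every stencil needs two neighbours on each side
print(fxxx(grid, i, j, h), fxxy(grid, i, j, h),
      fyyx(grid, i, j, h), fyyy(grid, i, j, h))
```

For this cubic test polynomial the stencils are exact up to rounding. Note that `fxxx` and `fyyy` reach two nodes away, so near the boundary you would need one-sided formulas instead.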

Related

Is it possible that the number of basis functions is more than the number of observations in spline regression?

I want to fit a regression spline with a B-spline basis. The data is structured in such a way that the number of observations is less than the number of basis functions, and I get a good result.
But I'm not sure whether this is a valid setup.
Do I have to have more rows than columns, as in linear regression?
Thank you.
When the number of observations, N, is small, it’s easy to fit a model with basis functions that has a low squared error. If you have more basis functions than observations, you can get zero residuals (a perfect fit to the data). But that fit is not to be trusted, because it may not be representative of new data points. So yes, you want more observations than columns. Mathematically, with more than N columns the design matrix cannot have full column rank, so the columns are collinear and the coefficients are not uniquely determined. As a rule of thumb, 15-20 observations are usually needed for each additional variable / spline.
But, this isn't always the case, such as in genetics when we have hundreds of thousands of potential variables and small sample size. In that case, we turn to tools that help with a small sample size, such as cross validation and bootstrap.
Bootstrap (i.e., resample with replacement) your data points and refit the splines many times (100 will probably do). Then average the fitted splines and use the averages as the final spline functions. Or you could do cross-validation, where you train on a subset of the data (say 70%) and then test on the remaining 30%.
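To make the resample-refit-average loop concrete, here is a rough Python sketch. To keep it short and dependency-free it fits a straight line by least squares instead of a spline; that is a deliberate simplification, and the data, the function names and the seed are made up for illustration. The bootstrap logic itself would be the same with a spline fit plugged in.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def bootstrap_fit(xs, ys, n_boot=100, seed=0):
    """Refit on n_boot resamples drawn with replacement, then average the coefficients."""
    rng = random.Random(seed)
    n = len(xs)
    fits = []
    while len(fits) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[k] for k in idx]
        by = [ys[k] for k in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample (a single x value): draw again
        fits.append(fit_line(bx, by))
    a = sum(fit[0] for fit in fits) / n_boot
    b = sum(fit[1] for fit in fits) / n_boot
    return a, b

# toy data: y = 1 + 2x plus a small alternating perturbation
xs = [float(k) for k in range(10)]
ys = [1.0 + 2.0 * x + (0.1 if k % 2 == 0 else -0.1) for k, x in enumerate(xs)]
a_hat, b_hat = bootstrap_fit(xs, ys)
print(a_hat, b_hat)
```

The spread of the individual bootstrap fits (not just their average) is also worth looking at, since it gives a feel for how unstable the fit is at this sample size.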
In the functional data analysis framework, there are packages in R that construct and fit spline bases (such as cubic, B, etc). These packages include refund, fda, and fda.usc.
For example,
B <- smooth.construct.cc.smooth.spec(object = list(term = "day.t", bs.dim = 12, fixed = FALSE, dim = 1, p.order = NA, by = NA),data = list(day.t = 200:320), knots = list())
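constructs a spline basis of dimension 12 over time (day.t). Note that the cc constructor comes from mgcv and builds a cyclic cubic regression spline rather than a B-spline basis. You can also use these packages to help choose a basis dimension.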

Backpropagation on Two Layered Networks

I have been following Stanford's cs231n lectures, trying to complete the assignments on my own and sharing my solutions on GitHub and my blog. But I'm having a hard time understanding how to model backpropagation. I can code modular forward and backward passes, but what bothers me is the following, for the model below: Two Layered Neural Network
Let's assume that our loss function here is a softmax loss. In my modular softmax_loss() function I calculate the loss and the gradient with respect to the scores (dSoft = dL/dY). After that, when I follow the graph backwards, say for b2, db2 would be equal to dSoft*1, and dW2 would be equal to dSoft*dX2 (the outputs of the ReLU gate). What's the chain rule here? Why isn't dSoft equal to 1, given that dL/dL would be 1?
The softmax loss function outputs a single number, the loss L, given an input x (the scores).
What dSoft means is that you're computing the derivative of the function softmax(x) with respect to its input x, not with respect to itself. dL/dL is indeed 1, but that is only the starting value of the backward pass; multiplying it by the local derivative of the loss with respect to x gives dSoft, which in general is not 1. Then, to calculate the derivative with respect to W of the last layer, you use the chain rule, i.e. dL/dW = dsoftmax/dx * dx/dW. Note that x = W*x_prev + b, where x_prev is the input to the last layer. Therefore dx/dW is just x_prev and dx/db is just 1, which means that dL/dW (or simply dW) is dsoftmax/dx * x_prev, and dL/db (or simply db) is dsoftmax/dx * 1. Note that here dsoftmax/dx is the dSoft we defined earlier.
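If it helps, here is a small pure-Python sketch of that last layer for a single example, with a numerical check of one entry of dW. The shapes, the weights and the helper names are invented for illustration, and I use the cs231n-style convention scores = x_prev . W + b with W of shape (inputs, classes); it is not the assignment's actual code.

```python
import math

def softmax_loss(scores, label):
    """Cross-entropy loss for one example and its gradient w.r.t. the scores (dSoft)."""
    top = max(scores)                      # shift for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    loss = -math.log(probs[label])
    dscores = list(probs)
    dscores[label] -= 1.0                  # dL/dscores = probs - one_hot(label)
    return loss, dscores

def affine_forward(x_prev, W, b):
    # scores[k] = sum_i x_prev[i] * W[i][k] + b[k]
    return [sum(x_prev[i] * W[i][k] for i in range(len(x_prev))) + b[k]
            for k in range(len(b))]

def affine_backward(dscores, x_prev):
    # chain rule: dW[i][k] = x_prev[i] * dscores[k], db[k] = dscores[k] * 1
    dW = [[x_i * d_k for d_k in dscores] for x_i in x_prev]
    db = list(dscores)
    return dW, db

x_prev = [0.5, 0.0, 2.0]                   # pretend these are the ReLU outputs
W = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
b = [0.0, 0.1]
label = 1

loss, dSoft = softmax_loss(affine_forward(x_prev, W, b), label)
dW, db = affine_backward(dSoft, x_prev)

def loss_with(w01):
    """Loss as a function of the single weight W[0][1], for the numerical check."""
    W_mod = [row[:] for row in W]
    W_mod[0][1] = w01
    return softmax_loss(affine_forward(x_prev, W_mod, b), label)[0]

eps = 1e-6
numeric_dW01 = (loss_with(W[0][1] + eps) - loss_with(W[0][1] - eps)) / (2 * eps)
print(dW[0][1], numeric_dW01)              # analytic vs. numerical gradient
```

The two printed numbers should agree closely, and you can see that dSoft is probs minus the one-hot label rather than 1.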

Cubic spline implementation in Octave

My bold claim is that the Octave implementation of the cubic spline, as implemented in interp1(..., "spline") differs from the "natural cubic spline" algorithm outlined in, e.g., Wolfram's Mathworld. I have written my own implementation of the latter and compared it to the output of the interp1(..., "spline") function, with the following results:
I discovered that when I try the same comparison with 4 points, the solutions also differ, and, moreover, the Octave solution is identical to fitting a single cubic polynomial to all four points (and not actually producing a piecewise spline for the three intervals).
I also tried to look under the hood at Octave's implementation of splines, and found it too opaque to read and understand in 5 minutes.
I know that there are a few options for boundary conditions one can choose ("natural" vs "clamped") when implementing a cubic spline. My implementation uses "natural" boundary conditions (in which the second derivative at the two exterior points is set to zero).
If Octave's cubic spline is indeed different to the standard cubic spline, then what actually is it?
EDIT:
The second order differences of the two solutions shown in the Comparison plot above are plotted here:
Firstly, there appear to be only two cubic polynomials in Octave's case: one that is fit over the first two intervals, and one that is fit over the last two intervals. Secondly, they are clearly not using "natural" splines, since the second derivatives at the extremes do not tend to zero.
Also, I think the fact that the second order difference for my implementation at the middle (i.e. 3rd) point is zero is just a coincidence, and not demanded by the algorithm. Repeating this test for a different set of points will confirm/refute this.
Different end conditions explain the difference between your implementation and Octave's. Octave uses the not-a-knot condition (depending on the input).
See help spline.
To explain your observations: under the not-a-knot condition the third derivative is continuous at the 2nd and (n-1)th breaks, which is why Octave's second derivative looks like it has fewer 'breaks': it is a continuous straight line over the first two segments and over the last two segments. If you look at the third derivative, you can see the effect more clearly: with five points, the 3rd derivative is discontinuous only at the 3rd break (the middle).
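x = 1:5;                              % five sample points
y = rand(1,5);                        % random data values
xx = linspace(1,5);                   % dense evaluation grid
pp = interp1(x, y, 'spline', 'pp');   % piecewise polynomial (pp) form of the spline
yy = ppval(pp, xx);                   % spline values
dyy = ppval(ppder(pp, 3), xx);        % third derivative of the spline
plot(xx, yy, xx, dyy);                % plot the spline and its third derivative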
Also the pp data structure looks like this
pp =
scalar structure containing the fields:
form = pp
breaks =
1 2 3 4 5
coefs =
0.427823 -1.767499 1.994444 0.240388
0.427823 -0.484030 -0.257085 0.895156
-0.442232 0.799439 0.058324 0.581864
-0.442232 -0.527258 0.330506 0.997395
pieces = 4
order = 4
dim = 1
orient = first
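You can read both properties straight off those coefficients. Here is a short Python sketch that takes the coefs and breaks printed above (copied by hand, so accurate to about 1e-6) and measures the jump in the value and in the first three derivatives at each interior break. It assumes the usual pp convention that each row holds the coefficients of a cubic in (x - breaks(k)), highest power first; the helper name is mine.

```python
coefs = [
    [0.427823, -1.767499, 1.994444, 0.240388],
    [0.427823, -0.484030, -0.257085, 0.895156],
    [-0.442232, 0.799439, 0.058324, 0.581864],
    [-0.442232, -0.527258, 0.330506, 0.997395],
]
breaks = [1, 2, 3, 4, 5]

def piece_derivatives(row, t):
    """Value, 1st, 2nd and 3rd derivative of a*t^3 + b*t^2 + c*t + d at local offset t."""
    a, b, c, d = row
    return (
        ((a * t + b) * t + c) * t + d,
        (3 * a * t + 2 * b) * t + c,
        6 * a * t + 2 * b,
        6 * a,
    )

# jumps[k] = |left limit - right limit| of (value, f', f'', f''') at interior break k+2
jumps = []
for k in range(len(coefs) - 1):
    width = breaks[k + 1] - breaks[k]
    left = piece_derivatives(coefs[k], width)
    right = piece_derivatives(coefs[k + 1], 0.0)
    jumps.append([abs(lv - rv) for lv, rv in zip(left, right)])

for brk, jump in zip(breaks[1:-1], jumps):
    print("break", brk, ["%.6f" % v for v in jump])

# second derivative at the two end points: zero for a natural spline
end_second = (
    piece_derivatives(coefs[0], 0.0)[2],
    piece_derivatives(coefs[-1], breaks[-1] - breaks[-2])[2],
)
print("second derivative at the ends:", end_second)
```

The value, first and second derivative are continuous at every interior break (to within the printed precision), the third derivative jumps only at the middle break (by about 5.22), and the second derivative at both ends is clearly nonzero, so this is not a natural spline.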

Generate Matrix from Another Matrix

I started learning Octave recently. How do I generate a matrix from another matrix by applying a function to each element?
For example:
Apply 2x+1 or 2x/(x^2+1) or 1/x+3 to a 3x5 matrix A.
The result should be a 3x5 matrix whose values are now 2x+1.
If A(1,1) = 1, then in the output matrix B
B(1,1) = 2*1+1 = 3
My main concern is functions that use the value of x in more involved ways, such as taking its inverse, as in the examples above.
regards.
You can try
B = A.*2 + 1
The . prefix means that the operation following it (here *) is applied to each element of the matrix. The same works for ./ and .^, so 2x/(x^2+1) can be written as 2*A ./ (A.^2 + 1).
You will find a lot of documentation for Octave in the distribution package and on the Web. Even better, you can usually also use the extensive documentation on Matlab.
ADDED. For more complex operations you can use arrayfun(), e.g.
B = arrayfun(@(x) 2*x/(x^2+1), A)

Using google maps API to find average speed at a location

I am trying to get the current traffic conditions at a particular location. The GTrafficOverlay object mentioned here only provides an overlay on an existing map.
Does anyone know how I can get this data from Google using their API?
This is only theoretical, but there may be a way to extract this data using the Distance Matrix API.
Method
1)
Build a topological road network of nodes and edges, something like this:
Each edge will have four attributes: [EDGE_NUMBER, EDGE_SPEED, EDGE_TIME, EDGE_LENGTH]
You can use OpenStreetMap data to create this network.
At the beginning each edge will have the same road speed, for example 50 km/h.
Keep only the drivable links and delete the other edges. Also take into account that some roads are one-way.
2)
Randomly choose two nodes that are at least 5 or 10 km apart.
Use Dijkstra's shortest-path algorithm on your topological network to calculate the shortest path between these two nodes (with cost = EDGE_TIME). The output will look like:
NODE = [NODE_23,NODE_44] PATH = [EDGE_3,EDGE_130,EDGE_49,EDGE_39]
Get the time needed to drive between the two nodes from the Distance Matrix API.
Preallocate a matrix A of size N x number_of_edges filled with zeros, where N is the number of node pairs you will sample.
Preallocate a matrix B of size N x 1 filled with zeros.
In the first row of matrix A, fill each column (one per edge) with the length of the edge if that edge is in the path, and leave it at zero otherwise.
[col_1,col_2,col_3,...,col_39,...,col_49,...,col_130]
[0, 0, len_3,...,len_39,...,len_49,...,len_130] %row 1
In the first row of matrix B put the time returned by the Distance Matrix API.
Then select two new nodes that were not used in the first path and repeat the operation until there are no nodes left (so you will fill row 2, row 3, ...).
Now you can solve the linear system Ax = B, where x holds the travel time per unit length of each edge, so speed = 1/x.
Assign the newly calculated speed to each edge.
3)
Repeat step 2) until the calculated speeds start to converge.
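The linear-algebra part of step 2) can be sketched on a toy network as follows. Everything here is hypothetical: three edges, three hand-picked paths, and travel times faked from assumed "true" speeds instead of coming from the Distance Matrix API. The system is square so that plain Gaussian elimination is enough; with real data you would have many more paths than this and would more likely use a least-squares solver.

```python
def solve_linear(A, b):
    """Solve a square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: the sampled paths do not determine every edge")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

edge_length = [2.0, 3.0, 5.0]        # km, one entry per edge
true_speed = [40.0, 50.0, 60.0]      # km/h, only used to fake the measured times
paths = [[0, 1], [1, 2], [0, 2]]     # edge indices along each sampled shortest path

# Row k of A holds the edge lengths along path k (zero elsewhere);
# B[k] is the travel time of path k in hours (stand-in for the API answer).
n_edges = len(edge_length)
A = [[edge_length[e] if e in path else 0.0 for e in range(n_edges)] for path in paths]
B = [sum(edge_length[e] / true_speed[e] for e in path) for path in paths]

x = solve_linear(A, B)               # hours per km on each edge
speed = [1.0 / v for v in x]         # back to km/h
print(speed)
```

On this toy input the recovered speeds match the assumed ones. Edges that never appear in any sampled path give a zero column in A and cannot be estimated, which is the case the ValueError guards against.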
Comment
I'm not sure that the calculated speeds will converge; it would be interesting to test the method. I will try to do that if I get some time.
The Distance Matrix API doesn't report travel times more precisely than 1 minute, which is why the two nodes of each pair need to be at least 5 or 10 km apart.
Also, this method does not comply with Google's terms of service.
Google does not make a public API available for this data.
Yahoo has a feed (example) with traffic conditions -- construction, accidents, and such. A write-up on how to access it is here.
If you want actual road speeds, you will probably need to work with a commercial provider.