Can someone explain the behavior of the functions mkpp and ppval?

If I do the following in MATLAB:
ppval(mkpp(1:2, [1 0 0 0]),1.5)
ans = 0.12500
This should construct the polynomial f(x) = x^3 and evaluate it at x = 1.5, so I expected 1.5^3 = 3.375. Why does it give me 0.125 instead? Now, if I change the domain defined in the first argument to mkpp, I get this:
> ppval(mkpp([1 1.5 2], [[1 0 0 0]; [1 0 0 0]]), 1.5)
ans = 0
So without changing the function, I change the answer. Awesome.
Can anyone explain what's going on here? How does changing the first argument to mkpp change the result I get?

The function MKPP shifts the polynomial so that x = 0 corresponds to the start of the range you give it. In your first example, the polynomial x^3 is shifted to the range [1 2], so if you want to evaluate the unshifted polynomial (i.e. as though it still started at x = 0), you have to shift the evaluation point by the start of the range:
>> pp = mkpp(1:2,[1 0 0 0]); %# Your polynomial
>> ppval(pp,1.5+pp.breaks(1)) %# Shift evaluation point by the range start
ans =
3.3750 %# The answer you expect
In your second example, you have one polynomial x^3 shifted to the range [1 1.5] and another polynomial x^3 shifted to the range of [1.5 2]. Evaluating the piecewise polynomial at x = 1.5 gives you a value of zero, occurring at the start of the second polynomial.
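To make the local-coordinate behavior concrete, here is a small check (the values in the comments are what the description above predicts, not copied from a session):
pp2 = mkpp([1 1.5 2], [1 0 0 0; 1 0 0 0]); %# The second piecewise polynomial
ppval(pp2, 1.25) %# First piece, local x = 0.25, so (0.25)^3 = 0.015625
ppval(pp2, 1.5)  %# Second piece starts here, local x = 0, so the value is 0
ppval(pp2, 1.75) %# Second piece, local x = 0.25, so (0.25)^3 = 0.015625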
It may help to visualize the polynomials you are making as follows:
x = linspace(0,3,100); %# A vector of x values
pp1 = mkpp([1 2],[1 0 0 0]); %# Your first piecewise polynomial
pp2 = mkpp([1 1.5 2],[1 0 0 0; 1 0 0 0]); %# Your second piecewise polynomial
subplot(1,2,1); %# Make a subplot
plot(x,ppval(pp1,x)); %# Evaluate and plot pp1 at all x
title('First Example'); %# Add a title
subplot(1,2,2); %# Make another subplot
plot(x,ppval(pp2,x)); %# Evaluate and plot pp2 at all x
axis([0 3 -1 8]) %# Adjust the axes ranges
title('Second Example'); %# Add a title


Compare two linear regression models in MATLAB

I want to compare the performance of two models using the F statistic. Here is a reproducible example and the expected results:
load carbig
tbl = table(Acceleration,Cylinders,Horsepower,MPG);
% Testing both models separately
mdl1 = fitlm(tbl,'MPG~1+Acceleration+Cylinders+Horsepower');
mdl2 = fitlm(tbl,'MPG~1+Acceleration');
% Comparing both models using the F-test and p-value
numerator = (mdl2.SSE-mdl1.SSE)/(mdl1.NumCoefficients-mdl2.NumCoefficients);
denominator = mdl1.SSE/mdl1.DFE;
F = numerator/denominator;
p = 1-fcdf(F,mdl1.NumCoefficients-mdl2.NumCoefficients,mdl1.DFE);
We end up with F = 298.75 and p = 0, indicating mdl1 is significantly better than mdl2, as assessed by the F statistic.
Is there any way to obtain the F and p values without running fitlm twice and doing all the computation by hand?
I tried running coefTest, as suggested by @Glen_b; however, the function is poorly documented and the results are not the ones I'm expecting.
[p,F] = coefTest(mdl1); % p = 0, F = 262.508 (this tests mdl1 vs. the constant model)
[p,F] = coefTest(mdl1,[0,0,1,1]); % p = 0, F = 57.662 (not sure what this is testing)
[p,F] = coefTest(mdl1,[1,1,0,0]); % p = 0, F = 486.810 (idem)
I believe I should carry out the test with a different null hypothesis (C) using the function [p,F] = coefTest(mdl1,H,C), but I don't really know how to do it and there's no example.
This answer is in regard to comparing two linear regression models where one model is a restricted version of the other.
Short answer:
To do an F-test on the restriction that the 3rd and 4th elements of your estimated coefficient vector b are zero:
[p, F] = coefTest(mdl1, [0, 0, 1, 0; 0, 0, 0, 1]);
Further explanation:
Let b be our estimated coefficient vector. Linear restrictions on b are typically written in matrix form: R*b = r. The restriction that the 3rd and 4th elements of b are zero would be written:
[0, 0, 1, 0;   * b = [0;
 0, 0, 0, 1]          0]
The matrix [0, 0, 1, 0; 0, 0, 0, 1] is what coefTest calls the H matrix in the docs.
P = coefTest(M,H), with H a numeric matrix having one column for each
coefficient, performs an F test that H*B=0, where B represents the
coefficient vector.
Long version
Sometimes with these econometric routines, it's nice just to write things out yourself so you know what's really going on.
Remove rows with NaN because they just add unrelated complexity:
tbl_dirty = table(Acceleration,Cylinders,Horsepower,MPG);
tbl = tbl_dirty(~any(ismissing(tbl_dirty),2),:);
Do the estimation etc...
n = height(tbl); % number of observations
y = tbl.MPG;
X = [ones(n, 1), tbl.Acceleration, tbl.Cylinders, tbl.Horsepower];
k = size(X,2); % number of variables (including constant)
b = X \ y; % estimate b with least squares
u = y - X * b; % calculates residuals
s2 = u' * u / (n - k); % estimate variance of error term (assuming homoskedasticity, independent observations)
BCOV = inv(X'*X) * s2; % get covariance matrix of b assuming homoskedasticity of error term etc...
bse = diag(BCOV).^.5; % standard errors
R = [0, 0, 1, 0;
0, 0, 0, 1];
r = [0; 0]; % Testing restriction: R * b = r
num_restrictions = size(R, 1);
F = (R*b - r)'*inv(R * BCOV * R')*(R*b - r) / num_restrictions; % F-stat (see Hayashi for reference)
Fp = 1 - fcdf(F, num_restrictions, n - k); % F p-val
For reference, see p. 65 of Hayashi's book Econometrics.
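As a sanity check (assuming the mdl1 fit from the question is still in the workspace), the hand-rolled statistics above should agree with coefTest when it is given the same restriction matrix; p_ct and F_ct here are just hypothetical names for the outputs:
[p_ct, F_ct] = coefTest(mdl1, R); % should match Fp and F computed above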
No, there is not.
fitlm fits an arbitrary model, in your case a regression model with an intercept and either one or three regressors. It might seem that the model with three regressors could reuse information from the model with one regressor, but this is only true if there are restrictions on the model, and even then the overlapping information is limited.
fitlm is a very general framework which can be used for arbitrary models. Doing multiple regressions at the same time, with sharing of information between them, would thus get quite complex and is not implemented.
It is possible to implement this yourself for these two specific models. Usually such a linear regression is solved using the covariance matrix:
Beta = (X' * X)^-1 * X' * y
where X is the data with the variables as columns and y is the target variable. In this case you could reuse the part of the covariance matrix you already have from the smaller regression: the variation in Acceleration. But since adding 2 new variables adds 8 values to the covariance matrix, you only save about 1/9 of that work. Furthermore, the heaviest part is the inversion. Thus the time improvement is very small.
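As a minimal sketch of that formula in Octave/MATLAB (assuming X and y are set up as in the long-version answer above); in practice the backslash operator is preferred over forming the inverse explicitly:
Beta = (X' * X) \ (X' * y); % solves the normal equations without an explicit inverse
Beta = X \ y;               % the same least-squares solution, written more simply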
In short, just do two separate regressions

Subscript indices must be real positive integers or logicals

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
% theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha
% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
% theta.
%
% Hint: While debugging, it can be useful to print out the values
% of the cost function (computeCost) and gradient here.
%
hypothesis = x*theta;
theta_0 = theta(1) - alpha(1/m)*sum((hypothesis-y)*x);
theta_1 = theta(2) - alpha(1/m)*sum((hypothesis-y)*x);
theta(1) = theta_0;
theta(2) = theta_1;
% ============================================================
% Save the cost J in every iteration
J_history(iter) = computeCost(X, y, theta);
end
end
I keep getting this error
error: gradientDescent: subscript indices must be either positive integers less than 2^31 or logicals
on this line right in-between the first theta and =
theta_0 = theta(1) - alpha(1/m)*sum((hypothesis-y)*x);
I'm very new to octave so please go easy on me, and
thank you in advance.
This is from Week 2 of the Coursera course on Machine Learning.
I'm 99% sure your error is on the line pointed out by topsig, where you have alpha(1/m).
It would help if you gave an example of the input values to your function and what you hoped to see as output, but I'm assuming from your comment
% taking num_iters gradient steps with learning rate alpha
that alpha is a constant, not a function. As such, you have written alpha(1/m) without any operator in between, and Octave interprets this as indexing alpha with the value 1/m.
i.e., if you had an array
x = [3 4 5]
x*(2) = [6 8 10] %% two times each element in the array
x(2) = [4] %% second element in the array
What you did doesn't make sense here, because m = length(y) is a scalar, so 1/m is a fraction rather than a valid index:
x = [3 4 5]; m = 3;
x*(1/m) = x*(1/3) = [1 1.3333 1.6667] %% each element divided by 3
x(1/m) = ___error___ %% the 1/3 element in the array makes no sense
Note that for certain errors Octave always indicates that the location of the error is at the assignment operator (the equal sign at the start of the line). If it points there, you usually have to look elsewhere in the line for the actual error. Here, it was complaining about the non-integer subscript (1/m).
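For what it's worth, here is a sketch of what the corrected update step could look like, assuming (as in that course exercise) that the first column of X is all ones and the second column is the feature; the fixes are the missing * after alpha, the element-wise .*, and using the function argument X instead of x:
hypothesis = X * theta;                                      % use X, not x
theta_0 = theta(1) - alpha * (1/m) * sum((hypothesis - y) .* X(:,1));
theta_1 = theta(2) - alpha * (1/m) * sum((hypothesis - y) .* X(:,2));
theta(1) = theta_0;
theta(2) = theta_1;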

Retrieving coefficients of polynomial from DFT using inverse DFT

I am trying to multiply two polynomials using the DFT, and I don't know how to do the last step: getting the coefficients of the product back from its DFT.
So there's p(x) = x - 4, with DFT -3, i-4, -5, -i-4,
and q(x) = x^2 - 1, with DFT 0, -2, 0, -2.
degree(pq) = 3
So we get the 4th roots of unity 1, i, -1, -i
The DFT for pq is 0, 8-2i, 0, 8+2i.
Could someone please tell me how to get the coefficients for pq now from its dft?
Thanks!
The first thing to understand is that multiplying two polynomials is the same as convolving the coefficients.
octave:1> p=[0 0 1 -4];
octave:2> q=[0 1 0 -1];
octave:3> conv(p,q)
ans =
0 0 0 1 -4 -1 4
Secondly, understand the conditions under which circular convolution is equivalent to linear convolution.
(Also, your DFT coeffs seem to be wrong)
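To actually recover the coefficients, one option is to let Octave's fft/ifft do the work. A minimal sketch (note that fft evaluates at the conjugate roots of unity, so the intermediate point values differ from the hand-computed ones above, but the recovered coefficients are the same, and length 4 is enough because deg(pq) = 3, tying back to the circular-vs-linear convolution condition):
p = [-4 1 0 0];            % x - 4, coefficients in ascending order, zero-padded to length 4
q = [-1 0 1 0];            % x^2 - 1, ascending order, zero-padded to length 4
PQ = fft(p) .* fft(q);     % pointwise product = DFT of the product polynomial
coeffs = real(ifft(PQ))    % 4 -1 -4 1, i.e. pq(x) = 4 - x - 4x^2 + x^3 (real() strips round-off)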

How does GNU Octave matrix division work? Getting unexpected behaviour.

In GNU Octave, how does matrix division work?
Instead of doing
1./[1;1]
I accidentally did
1/[1;1]
To my surprise this yields:
[0.5, 0.5]
The transposed case:
1/[1,1]
gives the expected:
error: operator /: nonconformant arguments (op1 is 1x1, op2 is 1x2)
Can someone explain the [0.5, 0.5] result?
Consider the following example:
>> A = [5 10];
>> B = [2 2];
If you want element-wise division, use A ./ B; both matrices must have the same size, i.e. if A is of size m×n then B must also be of size m×n.
>> A ./B
ans =
2.5000 5.0000
If you want matrix division, use A / B, where A is of size m×n and B is of size q×n (or m×n). The / operator conceptually returns x*y^-1 (i.e. x * pinv(y) in Octave terms).
>> A / B
ans = 3.7500
which is same as
>> A * pinv(B)
ans = 3.7500
The pinv() function in Octave/MATLAB returns the Moore-Penrose pseudoinverse of a matrix, whereas the inv() function returns the inverse of the matrix.
If you are confused about which to use, use pinv().
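As a quick illustration of the difference (a small sketch; the example matrices here are arbitrary):
v = [1; 2];
pinv(v)                  % 0.20000  0.40000 -- the pseudoinverse exists for any matrix
% inv(v) would be an error, because inv() requires a square matrix
A = [1 2; 3 4];
norm(inv(A) - pinv(A))   % effectively zero: for a square, nonsingular matrix they agree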
If you want further clarification about the difference between pinv and inv, here is an answer I got from Alan Boulton on the Coursera machine learning course discussion forum:
The gist of the idea is that x / y is defined quite generally so that it can deal with matrices. Conceptually the / operator is trying to return x*y^-1 (or x * inv(y) in Octave-speak), as in the following example:
octave:1> eye(2)/[1 2;3 4]
ans =
-2.00000 1.00000
1.50000 -0.50000
octave:2> inv([1 2;3 4])
ans =
-2.00000 1.00000
1.50000 -0.50000
The trickiness happens when y is a column vector, in which case inv(y) is undefined, so pinv(y), the pseudoinverse of y, is used.
octave:1> pinv([1;2])
ans =
0.20000 0.40000
octave:2> 1/[1;2]
ans =
0.20000 0.40000
The vector y needs to be compatible with x so that x * pinv(y) is well-defined. So it's ok if y is a row vector, so long as x is compatible. See the following Octave session for illustration:
octave:18> pinv([1 2])
ans =
0.20000
0.40000
octave:19> 1/[1 2]
error: operator /: nonconformant arguments (op1 is 1x1, op2 is 1x2)
octave:19> eye(2)/[1 2]
ans =
0.20000
0.40000
octave:20> eye(2)/[1;2]
error: operator /: nonconformant arguments (op1 is 2x2, op2 is 2x1)
octave:20> 1/[1;2]
ans =
0.20000 0.40000
Matrix division with Octave explained:
A formal description of Octave Matrix Division from here
http://www.gnu.org/software/octave/doc/interpreter/Arithmetic-Ops.html
x / y
Right division. This is conceptually equivalent to the expression
(inverse (y') * x')'
But it is computed without forming the inverse of y'.
If the system is not square, or if the coefficient matrix is
singular, a minimum norm solution is computed.
What that means is that these two should be the same:
[3 4]/[4 5; 6 7]
ans =
1.50000 -0.50000
(inverse([4 5; 6 7]') * [3 4]')'
ans =
1.50000 -0.50000
First, understand that Octave matrix division is not commutative, just like matrix multiplication is not commutative.
That means A / B does not equal B / A
1/[1;1]
ans =
0.50000 0.50000
[1;1]/1
ans =
1
1
One divided by a matrix with a single value one is one:
1/[1]
ans = 1
One divided by a matrix with a single value three is 0.33333:
1/[3]
ans = .33333
One divided by a (2x1) matrix:
1/[1;1]
ans =
0.50000 0.50000
Equivalent:
([1/2;1/2] * 1)'
ans =
0.50000 0.50000
Notice above that, as the documentation said, a minimum norm solution is computed. You can see how the [1;1] was turned into [1/2; 1/2]: the '2' comes from the number of elements in the vector, and the 1 comes from the supplied vector. We'll do another:
One divided by a (3x1) matrix:
1/[1;1;1]
ans =
0.33333 0.33333 0.33333
Equivalent:
([1/3;1/3;1/3] * 1)'
ans =
0.33333 0.33333 0.33333
What if one of the elements are negative...
1/[1;1;-1]
ans =
0.33333 0.33333 -0.33333
Equivalent:
([1/3;1/3;-1/3] * 1)'
ans =
0.33333 0.33333 -0.33333
So now you have a general idea of what Octave is doing when you don't supply it a square matrix. To understand what Octave matrix division does when you pass it a square matrix you need to understand what the inverse function is doing.
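For the square case, here is a small sketch reusing the numbers from the earlier example:
A = [4 5; 6 7];   % square and nonsingular
x = [3 4];
x / A             % 1.50000  -0.50000
x * inv(A)        % the same result: right division by a square matrix is x * inv(A)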
I've been normalizing your vectors by hand; if you want Octave to do it for you, you can add packages to do so. I think the following package will do what I've been doing with the vector normalization:
http://octave.sourceforge.net/geometry/function/normalizeVector.html
So now you can convert the division into an equivalent multiplication. Read this article on how matrix multiplication works, and you can backtrack and figure out what is going on under the hood of a matrix division.
http://www.purplemath.com/modules/mtrxmult2.htm

Matlab plotting the shifted logistic function

I would like to plot the shifted logistic function as shown from Wolfram Alpha.
In particular, I would like the function to be of the form
y = exp(x - t) / (1 + exp(x - t))
where t > 0. In the link, for example, t is 6. I had originally tried the following:
x = 0:.1:12;
y = exp(x - 6) ./ (1 + exp(x - 6));
plot(x, y);
axis([0 6 0 1])
However, this is not the same as the result from Wolfram Alpha. Here is an export of my plot.
I do not understand what the difference is between what I am trying to do here vs. plotting shifted sin and cosine functions (which works using the same technique).
I am not completely new to Matlab but I do not usually use it in this way.
Edit: My values for x in the code should have been from 0 to 12.
fplot takes as inputs a function handle and a range to plot for:
>> fplot(@(x) exp(x-6) ./ (1 + exp(x-6)), [0 12])
The beauty of fplot in this case is you don't need to spend time calculating y-values beforehand; you could also extract values from the graph after the fact if you want (by getting the line handle's XData and YData properties).
Your input to Wolfram Alpha is incorrect; it is interpreted as e*(x-6)/(1+e*(x-6)). Use plot y = exp(x - 6) / (1 + exp(x - 6)) for x from 0 to 12 in Wolfram Alpha (see here) for the same results as in MATLAB. Also use axis([0 12 0 1]) (or no axis statement at all on a new plot) to see the full results in MATLAB.
In reply to your comment: use y = exp(1)*(x - 6) ./ (1 + exp(1)*(x - 6)); to do in MATLAB what you were doing in Wolfram Alpha.
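For completeness, a sketch of the plot-based approach from the question with the corrected range and axis applied:
x = 0:.1:12;
y = exp(x - 6) ./ (1 + exp(x - 6));
plot(x, y);
axis([0 12 0 1])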