Why that I have written for Andrew Ng's course not accepted? - octave

Andrew Ng's course in Coursera, which Stanford's Machine Learning course, features programming assignments that deal with implementing the algorithms taught in class. The goal of this assignment is to implement linear regression through gradient descent with an input set of X, y, theta, alpha (learning rate), and number of iterations.
I implemented this solution in Octave, the prescribed language in the course.
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y);
J_history = zeros(num_iters, 1);
numJ = size(theta, 1);
for iter = 1:num_iters
for i = 1:m
for j = 1:numJ
temp = theta(j) - alpha /m * X(i, j) * (((X * theta)(i, 1)) - y(i, 1));
theta(j) = temp
end
prediction = X * theta;
J_history(iter, 1) = computeCost(X,y,theta)
end
end
On the other hand, here is the cost function:
function J = computeCost(X, y, theta)
m = length(y);
J = 0;
prediction = X * theta;
error = (prediction - y).^2;
J = 1/(2 * m) .* sum(error);
end
This does not pass the submit() function. The submit() function simply validates the data through passing an unknown test case.
I have checked other questions on StackOverflow but I really don't get it. :)
Thank you very much!

Your gradient seems to be correct and as already pointed out in the answer given by #Kasinath P, it is likely that the problem is that the code is too slow. You just need to vectorize it. In Matlab/Octave, you usually need to avoid for loops (note that although you have parfor in Matlab, it is not yet available in octave). So it is always better, performance-wise, to write something like A*x instead of iterating over each row of A with a for loop. You can read about vectorization here.
If I understand correctly, X is a matrix of size m*numJ where m is the number of examples, and numJ is the number of features (or the dimension of the space where each point lies. In that case, you can rewrite your cost function as
(1/(2*m)) * (X*theta-y)'*(X*theta-y);%since ||v||_2^2=v'*v for any vector v in Euclidean space
Now, we know from basic matrix calculus that for any two vectors s and v that are functions from R^{num_J} to R^m, the Jacobian of s^{t}v is given by
s^{t}Jacobian(v)+v^{t}*Jacobian(s) %this Jacobian will have size 1*num_J.
Applying that to your cost function, we obtain
jacobian=(1/m)*(theta'*X'-y')*X;
So if you just replace
for i = 1:m
for j = 1:numJ
%%% theta(j) updates
end
end
with
%note that the gradient is the transpose of the Jacobian we've computed
theta-=alpha*(1/m)*X'*(X*theta-y)
you should see a great increase in performance.

your computecost code is correct
and Better follow the vectorized implementation of Gradient Descent.
You are just iterating and it is slow and may have error.
That course aims you to do vectorized implementation as it is simple and handy at the same time.
I knew this because I did that after sweating a lot.
Vectorization is good:)

Related

administer drug dose at certain time step in octave

I am doing a simple task on ocatve.
I have to administer drug dose at day 1,7,21 and 28..
i wrote a function like that:
function xdot = f (x,t)
a=[1;7;21;28]
drug=0; ### initially drug is zero
if (t==a)
drug=drug+57.947;
else
drug=drug+0;
endif
xdot=(-0.4077)*(x)+ drug; passing the value of drug to differential equation
endfunction
In the main file i called this function in lsode:
t=linspace(0,30,30);
x0=0;
y=lsode(#ex,x0,t); ### ex is the file name where function is written
plot(t,y,'o')
This program doesn't work.. it displays all the time zero value for drug. Can anybody help me that how to administer dug dose with certain time step by manipulating linspace function.
It looks like you have a simple clearance model, and at each time in a, you want the dose to be delivered instantaneously. That is, at each time in a, the amount of drug in the subject increases by 57.947.
If that is the model that you have in mind, implementing it in the formula for xdot will not work well. You would actually need to implement it as a "delta function", and lsode won't work with that.
Instead, I recommend solving the problem in stages, corresponding to the time intervals [0, 1], [1, 7], [7, 21], and [21, 28]. During one stage, the differental equation is simply xdot = -0.4077*x. In the first stage, the initial condition is 0. In the next stage, the initial condition is the final value of the previous stage plus the dosage amount 57.947.
Here's a script:
dose = 57.947;
a = [1 7 21 28 30];
x0 = 0.0;
t0 = 0;
t = [];
y = [];
for t1 = a
tinterval = linspace(t0, t1, 2*(t1 - t0) + 1);
yinterval = lsode(#(x, t) -0.4077*x, x0, tinterval);
t = [t tinterval];
y = [y yinterval'];
t0 = t1;
x0 = yinterval(end) + dose;
end
plot(t, y, "linewidth", 2);
The script creates this plot:
Note that the differential equation xdot = -k*x has the solution x0*exp(-k*(t-t0)), so the call to lsode could be replaced with
yinterval = x0*exp(-0.4077*(tinterval - t0));
If you do this, also remove the transpose from yinterval two lines below:
y = [y yinterval];
If you want to keep the implementation of the administration of the drug dose within the formula for xdot, you'll need to distribute it over a small time interval. It could be implemented as a short rectangular pulse of width w and height 57.974/w. You'll also need to ensure that lsode takes sufficiently small internal time steps (less than w) so that it "sees" the drug dose.
You probably want replace t==a by ismember (t, a). Also, you should erase the else clause as it has no effect on the answer.
UPDATE
Consider rewriting your function as:
function xdot = f (x,t)
xdot = -0.4077*x + 57.947*ismember (t, [1 7 21 28])
endfunction

Defining a Differential Equation in Octave

I am attempting to use Octave to solve for a differential equation using Euler's method.
The Euler method was given to me (and is correct), which works for the given Initial Value Problem,
y*y'' + (y')^2 + 1 = 0; y(1) = 1;
That initial value problem is defined in the following Octave function:
function [YDOT] = f(t, Y)
YDOT(1) = Y(2);
YDOT(2) = -(1 + Y(2)^2)/Y(1);
The question I have is about this function definition. Why is YDOT(1) != 1? What is Y(2)?
I have not found any documentation on the definition of a function using function [YDOT] instead of simply function YDOT, and I would appreciate any clarification on what the Octave code is doing.
First things first: You have a (non linear) differential equation of order two which will require you to have two initial conditions. Thus the given information from above is not enough.
The following is defined for further explanations: A==B means A is identical to B; A=>B means B follows from A.
It seems you are mixing a few things. The guy who gave you the files rewrote the equation in the following way:
y*y'' + (y')^2 + 1 = 0; y(1) = 1; | (I) y := y1 & (II) y' := y2
(I) & (II)=>(III): y' = y2 = y1' | y2==Y(2) & y1'==YDOT(1)
Ocatve is "matrix/vector oriented" so we are writing everything in vectors or matrices. Rather writing y1=alpha and y2=beta we are writing y=[alpha; beta] where y(1)==y1=alpha and y(2)==y2=beta. You will soon realize the tremendous advantage of using especially this mathematical formalization for ALL of your problems.
(III) & f(t,Y)=>(IV): y2' == YDOT(2) = y'' = (-1 -(y')^2) / y
Now recall what is y' and y from the definitions in (I) and (II)!
y' = y2 == Y(2) & y = y1 == Y(1)
So we can rewrite equation (IV)
(IV): y2' == YDOT(2) = (-1 -(y')^2) / y == -(1 + Y(2)^2)/Y(1)
So from equation (III) and (IV) we can derive what you already know:
YDOT(1) = Y(2)
YDOT(2) = -(1 + Y(2)^2)/Y(1)
These equations are passed to the solver. Differential equations of all types are solved numerically by retrieving the "next" value in a near neighborhood to some "previously known" value. (The step size inside this neighborhood is one of the key questions when writing solvers!) So your solver uses your initial condition Y(1)==y(1)=1 to make the next step and calculate the "next" value. So right at the start YDOT(1)=Y(2)==y(2) but you didn't tell us this value! But from then on YDOT(1) is varied by the solver in dependency to the function shape to solve your problem and give you ONE unique y(t) solution.
It seems you are using Octave for the first time so let's make a last comment on function [YDOT] = f(t, Y). In general a function is defined in this way:
function[retVal1, retVal2, ...] = myArbitraryName(arg1, arg2, ...)
Where retVal is the return value or output and arg is the argument or input.

How to compute Fourier coefficients with MATLAB

I'm trying to compute the Fourier coefficients for a waveform using MATLAB. The coefficients can be computed using the following formulas:
T is chosen to be 1 which gives omega = 2pi.
However I'm having issues performing the integrals. The functions are are triangle wave (Which can be generated using sawtooth(t,0.5) if I'm not mistaking) as well as a square wave.
I've tried with the following code (For the triangle wave):
function [ a0,am,bm ] = test( numTerms )
b_m = zeros(1,numTerms);
w=2*pi;
for i = 1:numTerms
f1 = #(t) sawtooth(t,0.5).*cos(i*w*t);
f2 = #(t) sawtooth(t,0.5).*sin(i*w*t);
am(i) = 2*quad(f1,0,1);
bm(i) = 2*quad(f2,0,1);
end
end
However it's not getting anywhere near the values I need. The b_m coefficients are given for a
triangle wave and are supposed to be 1/m^2 and -1/m^2 when m is odd alternating beginning with the positive term.
The major issue for me is that I don't quite understand how integrals work in MATLAB and I'm not sure whether or not the approach I've chosen works.
Edit:
To clairify, this is the form that I'm looking to write the function on when the coefficients have been determined:
Here's an attempt using fft:
function [ a0,am,bm ] = test( numTerms )
T=2*pi;
w=1;
t = [0:0.1:2];
f = fft(sawtooth(t,0.5));
am = real(f);
bm = imag(f);
func = num2str(f(1));
for i = 1:numTerms
func = strcat(func,'+',num2str(am(i)),'*cos(',num2str(i*w),'*t)','+',num2str(bm(i)),'*sin(',num2str(i*w),'*t)');
end
y = inline(func);
plot(t,y(t));
end
Looks to me that your problem is what sawtooth returns the mathworks documentation says that:
sawtooth(t,width) generates a modified triangle wave where width, a scalar parameter between 0 and 1, determines the point between 0 and 2π at which the maximum occurs. The function increases from -1 to 1 on the interval 0 to 2πwidth, then decreases linearly from 1 to -1 on the interval 2πwidth to 2π. Thus a parameter of 0.5 specifies a standard triangle wave, symmetric about time instant π with peak-to-peak amplitude of 1. sawtooth(t,1) is equivalent to sawtooth(t).
So I'm guessing that's part of your problem.
After you responded I looked into it some more. Looks to me like it's the quad function; not very accurate! I recast the problem like this:
function [ a0,am,bm ] = sotest( t, numTerms )
bm = zeros(1,numTerms);
am = zeros(1,numTerms);
% 2L = 1
L = 0.5;
for ii = 1:numTerms
am(ii) = (1/L)*quadl(#(x) aCos(x,ii,L),0,2*L);
bm(ii) = (1/L)*quadl(#(x) aSin(x,ii,L),0,2*L);
end
ii = 0;
a0 = (1/L)*trapz( t, t.*cos((ii*pi*t)/L) );
% now let's test it
y = ones(size(t))*(a0/2);
for ii=1:numTerms
y = y + am(ii)*cos(ii*2*pi*t);
y = y + bm(ii)*sin(ii*2*pi*t);
end
figure; plot( t, y);
end
function a = aCos(t,n,L)
a = t.*cos((n*pi*t)/L);
end
function b = aSin(t,n,L)
b = t.*sin((n*pi*t)/L);
end
And then I called it like:
[ a0,am,bm ] = sotest( t, 100 );
and I got:
Sweetness!!!
All I really changed was from quad to quadl. I figured that out by using trapz which worked great until the time vector I was using didn't have enough resolution, which led me to believe it was a numerical issue rather than something fundamental. Hope this helps!
To troubleshoot your code I would plot the functions you are using and investigate, how the quad function samples them. You might be undersampling them, so make sure your minimum step size is smaller than the period of the function by at least factor 10.
I would suggest using the FFTs that are built-in to Matlab. Not only is the FFT the most efficient method to compute a spectrum (it is n*log(n) dependent on the length n of the array, whereas the integral in n^2 dependent), it will also give you automatically the frequency points that are supported by your (equally spaced) time data. If you compute the integral yourself (might be needed if datapoints are not equally spaced), you might calculate frequency data that are not resolved (closer spacing than 1/over the spacing in time, i.e. beyond the 'Fourier limit').

MatLab - nargout

I am learning MatLab on my own, and I have this assignment in my book which I don't quite understand. Basically I am writing a function that will calculate sine through the use of Taylor series. My code is as follows so far:
function y = sine_series(x,n);
%SINE_SERIES: computes sin(x) from series expansion
% x may be entered as a vector to allow for multiple calculations simultaneously
if n <= 0
error('Input must be positive')
end
j = length(x);
k = [1:n];
y = ones(j,1);
for i = 1:j
y(i) = sum((-1).^(k-1).*(x(i).^(2*k -1))./(factorial(2*k-1)));
end
The book is now asking me to include an optional output err which will calculate the difference between sin(x) and y. The book hints that I may use nargout to accomplish this, but there are no examples in the book on how to use this, and reading the MatLab help on the subject did not make my any wiser.
If anyone can please help me understand this, I would really appreciate it!
The call to nargout checks for the number of output arguments a function is called with. Depending on the size of nargout you can assign entries to the output argument varargout. For your code this would look like:
function [y varargout]= sine_series(x,n);
%SINE_SERIES: computes sin(x) from series expansion
% x may be entered as a vector to allow for multiple calculations simultaneously
if n <= 0
error('Input must be positive')
end
j = length(x);
k = [1:n];
y = ones(j,1);
for i = 1:j
y(i) = sum((-1).^(k-1).*(x(i).^(2*k -1))./(factorial(2*k-1)));
end
if nargout ==2
varargout{1} = sin(x)'-y;
end
Compare the output of
[y] = sine_series(rand(1,10),3)
and
[y err] = sine_series(rand(1,10),3)
to see the difference.

Solving Kepler's Equation computationally

I'm trying to solve Kepler's Equation as a step towards finding the true anomaly of an orbiting body given time. It turns out though, that Kepler's equation is difficult to solve, and the wikipedia page describes the process using calculus. Well, I don't know calculus, but I understand that solving the equation involves an infinite number of sets which produce closer and closer approximations to the correct answer.
I can't see from looking at the math how to do this computationally, so I was hoping someone with a better maths background could help me out. How can I solve this beast computationally?
FWIW, I'm using F# -- and I can calculate the other elements necessary for this equation, it's just this part I'm having trouble with.
I'm also open to methods which approximate the true anomaly given time, periapsis distance, and eccentricity
This paper:
A Practical Method for Solving the Kepler Equation http://murison.alpheratz.net/dynamics/twobody/KeplerIterations_summary.pdf
shows how to solve Kepler's equation using an iterative computing method. It should be fairly straightforward to translate it to the language of your choice.
You might also find this interesting. It's an ocaml program, part of which claims to contain a Kepler Equation solver. Since F# is in the ML family of languages (as is ocaml), this might provide a good starting point.
wanted to drop a reply in here in case this page gets found by anyone else looking for similar materials.
The following was written as an "expression" in Adobe's After Effects software, so it's javascriptish, although I have a Python version for a different app (cinema 4d). The idea is the same: execute Newton's method iteratively until some arbitrary precision is reached.
Please note that I'm not posting this code as exemplary or meaningfully efficient in any way, just posting code we produced on a deadline to accomplish a specific task (namely, move a planet around a focus according to Kepler's laws, and do so accurately). We don't write code for a living, and so we're also not posting this for critique. Quick & dirty is what meets deadlines.
In After Effects, any "expression" code is executed once - for every single frame in the animation. This restricts what one can do when implementing many algorithms, due to the inability to address global data easily (other algorithms for Keplerian motion use interatively updated velocity vectors, an approach we couldn't use). The result the code leaves behind is the [x,y] position of the object at that instant in time (internally, this is the frame number), and the code is intended to be attached to the position element of an object layer on the timeline.
This code evolved out of material found at http://www.jgiesen.de/kepler/kepler.html, and is offered here for the next guy.
pi = Math.PI;
function EccAnom(ec,am,dp,_maxiter) {
// ec=eccentricity, am=mean anomaly,
// dp=number of decimal places
pi=Math.PI;
i=0;
delta=Math.pow(10,-dp);
var E, F;
// some attempt to optimize prediction
if (ec<0.8) {
E=am;
} else {
E= am + Math.sin(am);
}
F = E - ec*Math.sin(E) - am;
while ((Math.abs(F)>delta) && (i<_maxiter)) {
E = E - F/(1.0-(ec* Math.cos(E) ));
F = E - ec * Math.sin(E) - am;
i = i + 1;
}
return Math.round(E*Math.pow(10,dp))/Math.pow(10,dp);
}
function TrueAnom(ec,E,dp) {
S=Math.sin(E);
C=Math.cos(E);
fak=Math.sqrt(1.0-ec^2);
phi = 2.0 * Math.atan(Math.sqrt((1.0+ec)/(1.0-ec))*Math.tan(E/2.0));
return Math.round(phi*Math.pow(10,dp))/Math.pow(10,dp);
}
function MeanAnom(time,_period) {
curr_frame = timeToFrames(time);
if (curr_frame <= _period) {
frames_done = curr_frame;
if (frames_done < 1) frames_done = 1;
} else {
frames_done = curr_frame % _period;
}
_fractime = (frames_done * 1.0 ) / _period;
mean_temp = (2.0*Math.PI) * (-1.0 * _fractime);
return mean_temp;
}
//==============================
// a=semimajor axis, ec=eccentricity, E=eccentric anomaly
// delta = delta digits to exit, period = per., in frames
//----------------------------------------------------------
_eccen = 0.9;
_delta = 14;
_maxiter = 1000;
_period = 300;
_semi_a = 70.0;
_semi_b = _semi_a * Math.sqrt(1.0-_eccen^2);
_meananom = MeanAnom(time,_period);
_eccentricanomaly = EccAnom(_eccen,_meananom,_delta,_maxiter);
_trueanomaly = TrueAnom(_eccen,_eccentricanomaly,_delta);
r = _semi_a * (1.0 - _eccen^2) / (1.0 + (_eccen*Math.cos(_trueanomaly)));
x = r * Math.cos(_trueanomaly);
y = r * Math.sin(_trueanomaly);
_foc=_semi_a*_eccen;
[1460+x+_foc,540+y];
You could check this out, implemented in C# by Carl Johansen
Represents a body in elliptical orbit about a massive central body
Here is a comment from the code
True Anomaly in this context is the
angle between the body and the sun.
For elliptical orbits, it's a bit
tricky. The percentage of the period
completed is still a key input, but we
also need to apply Kepler's
equation (based on the eccentricity)
to ensure that we sweep out equal
areas in equal times. This
equation is transcendental (ie can't
be solved algebraically) so we
either have to use an approximating
equation or solve by a numeric method.
My implementation uses
Newton-Raphson iteration to get an
excellent approximate answer (usually
in 2 or 3 iterations).