How does GNU Octave matrix division work? Getting unexpected behaviour. - octave

In GNU Octave, How does matrix division work?
Instead of doing
1./[1;1]
I accidentally did
1/[1;1]
To my surprise this yields:
[0.5, 0.5]
The transverse case:
1/[1,1]
gives the expected:
error: operator /: nonconformant arguments (op1 is 1x1, op2 is 1x2)
Can someone explain the [0.5, 0.5] result?

Consider the following example:
>> A = [5 10];
>> B = [2 2];
If you want element-wise division, use A ./ B with the matrix size of both elements equal i.e If A is of size m∗n B must be of size m∗n
>> A ./B
ans =
2.5000 5.0000
If you want matrix by matrix division, use A / B with the matrix size of element A as m∗n and B as q∗n or m∗n. The / operator is trying to return x∗y−1 (i.e x * pinv(y) in octave format).
>> A / B
ans = 3.7500
which is same as
>> A * pinv(B)
ans = 3.7500
pinv() function in OCTAVE/MATLAB returns the Moore-Penrose pseudo inverse of matrix, whereas the inv() function returns the inverse of the matrix.
If you are confused about what to use, use pinv()
If you want further clarification about What is the difference between pinv and inv?

this is a answer i got from Alan Boulton at the coursera machine learning course discussion forum:
The gist of the idea is that x / y is defined quite generally so that it can deal with matrices. Conceptually the / operator is trying to return x∗y−1 (or x * inv(y) in Octave-speak), as in the following example:
octave:1> eye(2)/[1 2;3 4]
ans =
-2.00000 1.00000
1.50000 -0.50000
octave:2> inv([1 2;3 4])
ans =
-2.00000 1.00000
1.50000 -0.50000
The trickiness happens when y is a column vector, in which case the inv(y) is undefined, so pinv(y), the psuedoinverse of y, is used.
octave:1> pinv([1;2])
ans =
0.20000 0.40000
octave:2> 1/[1;2]
ans =
0.20000 0.40000
The vector y needs to be compatible with x so that x * pinv(y) is well-defined. So it's ok if y is a row vector, so long as x is compatible. See the following Octave session for illustration:
octave:18> pinv([1 2])
ans =
0.20000
0.40000
octave:19> 1/[1 2]
error: operator /: nonconformant arguments (op1 is 1x1, op2 is 1x2)
octave:19> eye(2)/[1 2]
ans =
0.20000
0.40000
octave:20> eye(2)/[1;2]
error: operator /: nonconformant arguments (op1 is 2x2, op2 is 2x1)
octave:20> 1/[1;2]
ans =
0.20000 0.40000

Matrix division with Octave explained:
A formal description of Octave Matrix Division from here
http://www.gnu.org/software/octave/doc/interpreter/Arithmetic-Ops.html
x / y
Right division. This is conceptually equivalent to the expression
(inverse (y') * x')'
But it is computed without forming the inverse of y'.
If the system is not square, or if the coefficient matrix is
singular, a minimum norm solution is computed.
What that means is that these two should be the same:
[3 4]/[4 5; 6 7]
ans =
1.50000 -0.50000
(inverse([4 5; 6 7]') * [3 4]')'
ans =
1.50000 -0.50000
First, understand that Octave matrix division is not commutative, just like matrix Multiplication is not commutative.
That means A / B does not equal B / A
1/[1;1]
ans =
0.50000 0.50000
[1;1]/1
ans =
1
1
One divided by a matrix with a single value one is one:
1/[1]
ans = 1
One divided by a matrix with a single value three is 0.33333:
1/[3]
ans = .33333
One divided by a (1x2) matrix:
1/[1;1]
ans =
0.50000 0.50000
Equivalent:
([1/2;1/2] * 1)'
ans =
0.50000 0.50000
Notice above, like the instructions said, we are taking the norm of the vector. So you see how the [1;1] was turned into [1/2; 1/2]. The '2' comes from the length of the vector, the 1 comes from the supplied vector. We'll do another:
One divided by a (1x3) matrix:
1/[1;1;1]
ans =
0.33333 0.33333 0.33333
Equivalent:
([1/3;1/3;1/3] * 1)'
ans =
0.33333 0.33333 0.33333
What if one of the elements are negative...
1/[1;1;-1]
ans =
0.33333 0.33333 -0.33333
Equivalent:
([1/3;1/3;-1/3] * 1)'
ans =
0.33333 0.33333 -0.33333
So now you have a general idea of what Octave is doing when you don't supply it a square matrix. To understand what Octave matrix division does when you pass it a square matrix you need to understand what the inverse function is doing.
I've been normalizing your vectors by hand, if you want octave to do them you can add packages to do so, I think the following package will do what I've been doing with the vector normalization:
http://octave.sourceforge.net/geometry/function/normalizeVector.html
So now you can convert the division into an equivalent multiplication. Read this article on how matrix multiplication works, and you can backtrack and figure out what is going on under the hood of a matrix division.
http://www.purplemath.com/modules/mtrxmult2.htm

Related

Non-linear fit Gnu Octave

I have a problem in performing a non linear fit with Gnu Octave. Basically I need to perform a global fit with some shared parameters, while keeping others fixed.
The following code works perfectly in Matlab, but Octave returns an error
error: operator *: nonconformant arguments (op1 is 34x1, op2 is 4x1)
Attached my code and the data to play with:
clear
close all
clc
pkg load optim
D = dlmread('hd', ';'); % raw data
bkg = D(1,2:end); % 4 sensors bkg
x = D(2:end,1); % input signal
Y = D(2:end,2:end); % 4 sensors reposnse
W = 1./Y; % weights
b0 = [7 .04 .01 .1 .5 2 1]; % educated guess for start the fit
%% model function
F = #(b) ((bkg + (b(1) - bkg).*(1-exp(-(b(2:5).*x).^b(6))).^b(7)) - Y) .* W;
opts = optimset("Display", "iter");
lb = [5 .001 .001 .001 .001 .01 1];
ub = [];
[b, resnorm, residual, exitflag, output, lambda, Jacob\] = ...
lsqnonlin(F,b0,lb,ub,opts)
To give more info, giving array b0, b0(1), b0(6) and b0(7) are shared among the 4 dataset, while b0(2:5) are peculiar of each dataset.
Thank you for your help and suggestions! ;)
Raw data:
0,0.3105,0.31342,0.31183,0.31117
0.013229,0.329,0.3295,0.332,0.372
0.013229,0.328,0.33,0.33,0.373
0.021324,0.33,0.3305,0.33633,0.399
0.021324,0.325,0.3265,0.333,0.397
0.037763,0.33,0.3255,0.34467,0.461
0.037763,0.327,0.3285,0.347,0.456
0.069405,0.338,0.3265,0.36533,0.587
0.069405,0.3395,0.329,0.36667,0.589
0.12991,0.357,0.3385,0.41333,0.831
0.12991,0.358,0.3385,0.41433,0.837
0.25368,0.393,0.347,0.501,1.302
0.25368,0.3915,0.3515,0.498,1.278
0.51227,0.458,0.3735,0.668,2.098
0.51227,0.47,0.3815,0.68467,2.124
1.0137,0.61,0.4175,1.008,3.357
1.0137,0.599,0.422,1,3.318
2.0162,0.89,0.5335,1.645,5.006
2.0162,0.872,0.5325,1.619,4.938
4.0192,1.411,0.716,2.674,6.595
4.0192,1.418,0.7205,2.691,6.766
8.0315,2.34,1.118,4.195,7.176
8.0315,2.33,1.126,4.161,6.74
16.04,3.759,1.751,5.9,7.174
16.04,3.762,1.748,5.911,7.151
32.102,5.418,2.942,7.164,7.149
32.102,5.406,2.941,7.164,7.175
64.142,7.016,4.478,7.174,7.176
64.142,7.018,4.402,7.175,7.175
128.32,7.176,6.078,7.175,7.176
128.32,7.175,6.107,7.175,7.173
255.72,7.165,7.162,7.165,7.165
255.72,7.165,7.164,7.166,7.166
511.71,7.165,7.165,7.165,7.165
511.71,7.165,7.165,7.166,7.164
Giving the function definition above, if you call it by F(b0) in the command windows, you will get a 34x4 matrix which is correct, since variable Y has the same size.
In that way I can (in theory) compute the standard formula for lsqnonlin (fit - measured)^2

Subscript indices must be real positive integers or logicals

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
% theta = GRADIENTDESENT(X, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha
% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
% theta.
%
% Hint: While debugging, it can be useful to print out the values
% of the cost function (computeCost) and gradient here.
%
hypothesis = x*theta;
theta_0 = theta(1) - alpha(1/m)*sum((hypothesis-y)*x);
theta_1 = theta(2) - alpha(1/m)*sum((hypothesis-y)*x);
theta(1) = theta_0;
theta(2) = theta_1;
% ============================================================
% Save the cost J in every iteration
J_history(iter) = computeCost(X, y, theta);
end
end
I keep getting this error
error: gradientDescent: subscript indices must be either positive integers less than 2^31 or logicals
on this line right in-between the first theta and =
theta_0 = theta(1) - alpha(1/m)*sum((hypothesis-y)*x);
I'm very new to octave so please go easy on me, and
thank you in advance.
This is from the coursera course on Machine Learning from Week 2
99% sure your error is on the line pointed out by topsig, where you have alpha(1/m)
it would help if you gave an example of input values to your function and what you hoped to see as an output, but I'm assuming from your comment
% taking num_iters gradient steps with learning rate alpha
that alpha is a constant, not a function. as such, you have the line alpha(1/m) without any operator in between. octave sees this as you indexing alpha with the value of 1/m.
i.e., if you had an array
x = [3 4 5]
x*(2) = [6 8 10] %% two times each element in the array
x(2) = [4] %% second element in the array
what you did doesn't seem to make sense, as 'm = length(y)' which will output a scalar, so
x = [3 4 5]; m = 3;
x*(1/m) = x*(1/3) = [1 1.3333 1.6666] %% element / 3
x(1/m) = ___error___ %% the 1/3 element in the array makes no sense
note that for certain errors it always indicates that the location of the error is at the assignment operator (the equal sign at the start of the line). if it points there, you usually have to look elsewhere in the line for the actual error. here, it was yelling at you for trying to apply a non-integer subscript (1/m)

How do I assign variables in matrices?

I can't make matrices with variables in it for some reason. I get following message.
>>> A= [a b ;(-1-a) (1-b); (1+a) b]
error: horizontal dimensions mismatch (2x3 vs 1x1)
Why is it? Please show me correct way if I'm wrong.
In Matlab you first need to assign a variable before you can use it,
a = 1;
b = a+1;
This will thus give an error,
clear;
b = a+1; % ERROR! Undefined function or variable 'a
Matlab does never accept unassigned variables. This is because, on the lowest level, you do not have a. You will have machine code which is assgined the value of a. This is handled by the JIT compiler in Matlab, so you do not need to worry about this though.
If you want to use something as the variable which you have in maths you can specifically express this to matlab. The object is called a sym and the syntax that define the sym x to a variable xis,
syms x;
That said, you can define a vector or a matrix as,
syms a b x y; % Assign the syms
A = [x y]; % Vector
B = A= [a b ;(-1-a) (1-b); (1+a) b]; % Matrix.
The size of a matrix can be found with size(M) or for dim n size(M,n). You can calcuate the matrix product M3=M1*M2 if and only if M1 have the size m * n and M2 have the size n * p. The size of M3 will then be m * p. This will also mean that the operation A^N = A * A * ... is only allowed when m=n so to say, the matrix is square. This can be verified in matlab by the comparison,
syms a b
A = [a,1;56,b]
if size(A,1) == size(A,2)
disp(['A is a square matrix of size ', num2str(size(A,1)]);
else
disp('A is not square');
end
These are the basic rules for assigning variables in Matlab as well as for matrix multiplication. Further, a google search on the error error: 'x' undefined does only give me octave hits. Are you using octave? In that case I cannot guarantee that you can use sym objects or that the syntaxes are correct.

Using arrayfun to apply two arguments of a function on every combination

Let i = [1 2] and j = [3 5]. Now in octave:
arrayfun(#(x,y) x+y,i,j)
we get [4 7]. But I want to apply the function on the combinations of i vs. j to get [i(1)+j(1) i(1)+j(2) i(2)+j(1) i(2)+j(2)]=[4 6 5 7].
How do I accomplish this? I know I can go with for-loopsl but I want vectorized-code because it's faster.
In Octave, for finding summations between two vectors, you can use a truly vectorized approach with broadcasting like so -
out = reshape(ii(:).' + jj(:),[],1)
Here's a runtime test on ideone for the input vectors of size 1 x 100 each -
-------------------- With FOR-LOOP
Elapsed time is 0.148444 seconds.
-------------------- With BROADCASTING
Elapsed time is 0.00038299 seconds.
If you want to keep it generic to accommodate operations other than just summations, you can use anonymous functions like so -
func1 = #(I,J) I+J;
out = reshape(func1(ii,jj.'),1,[])
In MATLAB, you could accomplish the same with two bsxfun alternatives as listed next.
I. bsxfun with Anonymous Function -
func1 = #(I,J) I+J;
out = reshape(bsxfun(func1,ii(:).',jj(:)),1,[]);
II. bsxfun with Built-in #plus -
out = reshape(bsxfun(#plus,ii(:).',jj(:)),1,[]);
With the input vectors of size 1 x 10000 each, the runtimes at my end were -
-------------------- With FOR-LOOP
Elapsed time is 1.193941 seconds.
-------------------- With BSXFUN ANONYMOUS
Elapsed time is 0.252825 seconds.
-------------------- With BSXFUN BUILTIN
Elapsed time is 0.215066 seconds.
First, your first example is not the best because the most efficient way to accomplish what you're doing with arrayfun would be to vectorize:
a = [1 2];
b = [3 5];
out = a+b
Second, in Matlab at least, arrayfun is not necessarily faster than a simple for loop. arrayfun is mainly a convenience (especially for it's more advanced options). Try this simple timing example yourself:
a = 1:1e5;
b = a+1;
y = arrayfun(#(x,y)x+y,a,b); % Warm up
tic
y = arrayfun(#(x,y)x+y,a,b);
toc
y = zeros(1,numel(a));
for k = 1:numel(a)
y(k) = a(k)+b(k); % Warm up
end
tic
y = zeros(1,numel(a));
for k = 1:numel(a)
y(k) = a(k)+b(k);
end
toc
In Matlab R2015a, the for loop method is over 70 times faster run from the Command window and over 260 times faster when run from an M-file function. Octave may be different, but you should experiment.
Finally, you can accomplish what you want using meshgrid:
a = [1 2];
b = [3 5];
[x,y] = meshgrid(a,b);
out = x(:).'+y(:).'
which returns [4 6 5 7] as in your question. You can also use ndgrid to get output in a different order.

Plotting a 3D function with Octave

I am having a problem graphing a 3d function - when I enter data, I get a linear graph and the values don't add up if I perform the calculations by hand. I believe the problem is related to using matrices.
INITIAL_VALUE=999999;
INTEREST_RATE=0.1;
MONTHLY_INTEREST_RATE=INTEREST_RATE/12;
# ranges
down_payment=0.2*INITIAL_VALUE:0.1*INITIAL_VALUE:INITIAL_VALUE;
term=180:22.5:360;
[down_paymentn, termn] = meshgrid(down_payment, term);
# functions
principal=INITIAL_VALUE - down_payment;
figure(1);
plot(principal);
grid;
title("Principal (down payment)");
xlabel("down payment $");
ylabel("principal $ (amount borrowed)");
monthly_payment = (MONTHLY_INTEREST_RATE*(INITIAL_VALUE - down_paymentn))/(1 - (1 + MONTHLY_INTEREST_RATE)^-termn);
figure(2);
mesh(down_paymentn, termn, monthly_payment);
title("monthly payment (principal(down payment)) / term months");
xlabel("principal");
ylabel("term (months)");
zlabel("monthly payment");
The 2nd figure like I said doesn't plot like I expect. How can I change my formula for it to render properly?
I tried your script, and got the following error:
error: octave_base_value::array_value(): wrong type argument `complex matrix'
...
Your monthly_payment is a complex matrix (and it shouldn't be).
I guess the problem is the power operator ^. You should be using .^ for element-by-element operations.
From the documentation:
x ^ y
x ** y
Power operator. If x and y are both scalars, this operator returns x raised to the power y. If x is a scalar and y is a square matrix, the result is computed using an eigenvalue expansion. If x is a square matrix. the result is computed by repeated multiplication if y is an integer, and by an eigenvalue expansion if y is not an integer. An error results if both x and y are matrices.
The implementation of this operator needs to be improved.
x .^ y
x .** y
Element by element power operator. If both operands are matrices, the number of rows and columns must both agree.