Automatic broadcasting warning: How to compare matrix rows to vector in Octave - octave

I'm getting the warning:
warning: mx_el_eq: automatic broadcasting operation applied
From the code:
f = [1;2;3];
f == 1:3;
warning: mx_el_eq: automatic broadcasting operation applied
Can this be done without warnings?

This is because you are comparing the column vector f with the row vector 1:3. In Matlab this would be an error (at least in releases before R2016b, which added implicit expansion); Octave, however, broadcasts automatically. This means that in order to apply the == operator it will expand an operand along a singleton dimension (i.e. a dimension of size 1). In your case both vectors have a singleton dimension to expand, so you essentially get the equivalent of:
a1 = [1 1 1;
2 2 2;
3 3 3]; %// for f
a2 = [1 2 3
1 2 3
1 2 3]; %// for 1:3
a1 == a2
Note that in order to get the same result in Matlab you would have to call bsxfun directly:
bsxfun(@eq, f, 1:3)
In order to compare your vectors elementwise without broadcasting, you just need to transpose one of them:
f' == 1:3
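To make the expansion concrete, here is a minimal plain-Python sketch (not Octave) of what comparing a 3x1 column with a 1x3 row amounts to. The helper name broadcast_eq is made up for this illustration, and ordinary lists stand in for the Octave vectors.

```python
def broadcast_eq(column, row):
    """Compare every element of `column` with every element of `row`.

    Mimics the result of broadcasting a column vector against a row vector:
    one output row per column element, one output column per row element.
    """
    return [[c == r for r in row] for c in column]


f = [1, 2, 3]   # stands in for the column vector f = [1;2;3]
r = [1, 2, 3]   # stands in for the row vector 1:3

full = broadcast_eq(f, r)                      # 3x3 result, like f == 1:3
elementwise = [a == b for a, b in zip(f, r)]   # 1x3 result, like f' == 1:3

for line in full:
    print(line)
print(elementwise)
```

The 3x3 result is true only on the diagonal, while the transposed comparison gives one truth value per element.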

Automatic broadcasting was a new feature introduced in Octave 3.6. It surprised many people (who were expecting an error), so it was decided to throw a warning. To disable this warning, turn it off with:
warning ("off", "Octave:broadcast");
You can also turn it off only in the scope of your function:
warning ("off", "Octave:broadcast", "local");
However, I'd recommend you do it in your .octaverc file instead.
The problem with the decision of throwing a warning is that it sounds like you are doing something wrong when you're really not. So as of Octave 4.0, that warning got removed (it is now part of the "Octave:language-extension" warning id).

Related

Octave eigs() function bugged?

Running Octave 6.3.0 for Windows. I need to get the smallest eigenvalue of some matrix. eigs(A,1,"sm") is supposed to do that, but I often get wrong results with singular matrices.
eigs(A) (which returns the first 6 eigenvalues/vectors) is correct (at least to machine precision):
>> A = [[1 1 1];[1 1 1];[1 1 1]]
A =
1 1 1
1 1 1
1 1 1
>> [v lambda flag] = eigs(A)
v =
0.5774 -0.3094 -0.7556
0.5774 -0.4996 0.6458
0.5774 0.8091 0.1098
lambda =
Diagonal Matrix
3.0000e+00 0 0
0 -4.5198e-16 0
0 0 -1.5831e-17
flag = 0
But eigs(A,1,"sm") is not:
>> [v lambda flag] = eigs(A,1,"sm")
warning: eigs: 'A - sigma*B' is singular, indicating sigma is exactly an eigenvalue so convergence is not guaranteed
warning: called from
eigs at line 298 column 20
warning: matrix singular to machine precision
warning: called from
eigs at line 298 column 20
warning: matrix singular to machine precision
warning: called from
eigs at line 298 column 20
warning: matrix singular to machine precision
warning: called from
eigs at line 298 column 20
warning: matrix singular to machine precision
warning: called from
eigs at line 298 column 20
v =
-0.7554
0.2745
0.5950
lambda = 0.4322
flag = 0
Not only is the returned eigenvalue wrong, but the returned flag is zero, indicating that everything went right in the function...
Is it a wrong usage of eigs() (but from the doc I can't see what is wrong) or a bug?
EDIT: if not a bug, at least a design issue... No problem either when requesting the 2 smallest values instead of the smallest value alone.
>> eigs(A,2,"sm")
ans =
-1.7700e-17
-5.8485e-16
EDIT 2: the eigs() function in Matlab online runs fine and returns the correct result (to machine precision):
>> A=ones(3)
A =
1 1 1
1 1 1
1 1 1
>> [v lambda flag] = eigs(A,1,"smallestabs")
v =
-0.7556
0.6458
0.1098
lambda =
-1.5831e-17
flag =
0
After more tests and investigations I think I can answer that yes, Octave eigs() has some flaw.
eigs(A,1,"sm") likely uses the inverse power iteration method, that is repeatedly solving y=A\x, then x=y, starting with an arbitrary x vector. Obviously there's a problem if A is singular. However:
Matlab eigs() runs fine in such case, and returns the correct eigenvalue (at the machine precision). I don't know what it does, maybe adding a tiny value on the diagonal if the matrix is detected as singular, but it does something better (or at least different) than Octave.
If for some (good or bad) reason Octave's algorithm cannot handle a singular matrix, then this should be reflected in the 3rd return argument ("flag"). Instead, it is always zero as if everything went OK.
eigs(A,1,"sm") is actually equivalent to eigs(A,1,0), and the more general syntax is eigs(A,1,sigma), which means "find the eigenvalue closest to sigma, and the associated eigenvector". For this, the inverse power iteration method is applied to the matrix A-sigma*I. Problem: if sigma is already an exact eigenvalue, this matrix is singular by definition. Octave eigs() fails in this case, while Matlab eigs() succeeds. It's kind of weird to have a failure when one knows the exact eigenvalue in advance, or hits it by chance. So the workaround in Octave is to test whether A-sigma*I is singular, and if so add a tiny value to sigma: eigs(A,1,sigma+eps*norm(A)). Matlab eigs() probably does something like that.
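The reasoning above can be tried out with a small plain-Python sketch of shifted inverse iteration. It illustrates the textbook method only and makes no claim about what Octave's or Matlab's eigs() actually does internally. The function names solve and inverse_iteration, the start vector, and the shift 1e-8 are all assumptions chosen for this example (the shift is deliberately larger than the eps*norm(A) suggested above to keep the toy solver well behaved).

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ZeroDivisionError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def inverse_iteration(A, sigma, steps=20):
    """Estimate the eigenvalue of A closest to sigma."""
    n = len(A)
    # Build the shifted matrix A - sigma*I.
    M = [[A[i][j] - (sigma if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    x = [float(i + 1) for i in range(n)]   # arbitrary start vector
    for _ in range(steps):
        y = solve(M, x)                    # y = (A - sigma*I) \ x
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    # Rayleigh quotient x'*A*x (x has unit norm).
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(x[i] * Ax[i] for i in range(n))


A = [[1.0, 1.0, 1.0] for _ in range(3)]    # same matrix as ones(3)

try:
    inverse_iteration(A, 0.0)              # sigma is an exact eigenvalue
except ZeroDivisionError as err:
    print("sigma = 0 fails:", err)

lam = inverse_iteration(A, 1e-8)           # slightly perturbed shift
print("perturbed shift gives eigenvalue close to", lam)
```

With the exact shift the linear solve breaks down, whereas the perturbed shift converges to an eigenvalue that is zero to within rounding error.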

How do I fix the index error in my Octave code?

I'm having issues with indexing in my code. I'm trying to write Octave code for the power method (vector iteration), and the error 'x(4): out of bound 3' keeps popping up at line 6.
A=[6,-2,2,4;0,-4,2,2;0,0,2,-5;0,0,0,-3]
b=[12;10;-9;-3]
n=4
for i=rows(A):-1:1
for j=i+1:rows(A)
x(i)=[b(i)-A(i,j)*x(j)]/A(i,i); #error: 'x(4): out of bound 3'
endfor
endfor
x
In the following line, note that you have x appearing twice; the first seeks to assign to it, but the second simply tries to access its value:
x(i) = [ b(i) - A(i,j) * x(j) ] / A(i,i);
⬑ assignment             ⬑ access
Assigning to an index that doesn't exist (yet) is absolutely fine; octave will simply fill in the intervening values with 'zeroes'. E.g.
>> clear x
>> x(3) = 1 % output: x = [0, 0, 1]
However, trying to access an index which doesn't exist yet is an error, since there's nothing there to access. This results in an "out of bound" error (and, in its error message, octave is kind enough to tell you what the last legitimate index is that you can access in that particular array).
Therefore, this is an error:
>> clear x
>> x(3) = 1 % output: x = [0, 0, 1]
>> 1 + x(4) % output: error: x(4): out of bound 3
Now going back to your specific code, you are trying to access something that doesn't exist yet. The reason it doesn't exist yet, is that you have set up your for loops such that j will achieve a higher value than i at a particular step, such that you are trying to access x(j), which does not exist yet, in order to assign it to x(i), where i < j. Therefore this results in an out of bounds error (you are trying to access index j when you only have up to i available).
In your particular case, octave informs you that this happened when j was 4, and i was 3.
PS: I will echo @HansHirse's implied warning here, that you should always pay attention to your variables, and clear them appropriately in your scripts, especially if you plan to run them many times. Never use a variable that you haven't defined (or cleared) beforehand. Otherwise, x here may not be undefined when you run your script, say, a second time. This leads to all sorts of problems, e.g., your code works but for the wrong reasons, and then fails to work again when you run it the next day and x is now undefined. In this particular example, if you had an x in your workspace which had the right number of elements in it, your code would "work" but produce the wrong result, and you wouldn't know any better.
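As a rough illustration of the "define before you read" advice, here is a plain-Python sketch that preallocates x and only reads entries that have already been computed. It assumes the loops in the question were meant to perform back substitution on the upper-triangular system A*x = b, which is what the code's structure suggests even though the question calls it the power method. The function name back_substitute is made up for this example.

```python
A = [[6, -2, 2, 4],
     [0, -4, 2, 2],
     [0, 0, 2, -5],
     [0, 0, 0, -3]]
b = [12, 10, -9, -3]


def back_substitute(A, b):
    """Solve an upper-triangular system A x = b, last row first."""
    n = len(b)
    x = [0.0] * n                       # preallocate: every x[j] exists before it is read
    for i in range(n - 1, -1, -1):      # i = n-1, ..., 0
        # Sum over columns to the right of the diagonal; those x[j]
        # were filled in by earlier (higher-i) iterations.
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x


x = back_substitute(A, b)
print(x)
```

Because x is created inside the function on every call, a stale x left over from an earlier run cannot influence the result.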

error: 'y' undefined near line 8 column 12 error: called from computeCost at line 8 column 3

error: 'y' undefined near line 8 column 12
error: called from computeCost at line 8 column 3
Here is my code:
1;
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
% J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
% parameter for linear regression to fit the data points in X and y
% Initialize some useful values
m = length(y); % number of training examples
% You need to return the following variables correctly
J = 0;
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
% You should set J to the cost.
J = sum(( X * theta - y ) .^2 )/( 2 * m );
% =========================================================================
end
I am guessing it's an error from Coursera ML course assignment. I think you are trying to run the file which contains the implementation of function computeCost(X, y, theta), not the file which calls the computeCost(,,) function with values of X, y, theta. This is why you are getting the error as you aren't providing y.
Run the file which is calling computeCost() function, not the file which contains the implementation of computeCost() function.
That is:
For Week2 Assignment 1: Run ex1.m file
For Week3 Assignment 2: Run ex2.m file
There are two things happening here. First you are defining your function dynamically as opposed to in its own file; not sure why you would prefer that.
Second, after having defined this computeCost function, you are calling it from a context where you did not pass a y argument (or presumably, you didn't pass any arguments to it, and y happens to be the first one detected as missing inside the function).
Since this is a cost function and your code looks suspiciously like code from Andrew Ng's Machine Learning course on Coursera, I am going to go out on a limb here and guess that you called computeCost from something else that was supposed to use it as a cost function to be optimised, e.g. fminunc. Typically functions like fminunc expect a function handle as an argument, but they expect a very specific function handle too. If you look at the help of fminunc, it states that:
FCN should accept a vector (array) defining the unknown variables,
and return the objective function value, optionally with gradient.
Therefore, if you want to pass a function that is to be computed with three arguments, you need to "wrap" it into your own handle, which you can define on the spot, e.g. @(x) computeCost(x, y, t) (assuming 'y' and 't' exist already).
So, I'm guessing that instead of calling fminunc like so: fminunc( @(x) computeCost(x, y, t) ),
you probably called it like so: fminunc( @computeCost )
or even like so: fminunc( computeCost ) (which evaluates the function first, rather than pass a function handle as an argument).
Basically, go back to the code given to you by coursera, or read the notes carefully. You're calling things the wrong way.
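The wrapping idea described above is not specific to Octave. A minimal plain-Python sketch, with made-up data and the hypothetical names compute_cost and cost_of_theta, shows a three-argument cost function being turned into the one-argument function an optimizer would expect. Here the free variable is theta, on the assumption that theta is what the optimizer varies.

```python
def compute_cost(X, y, theta):
    """Squared-error cost J = sum((X*theta - y)^2) / (2*m), as in the question."""
    m = len(y)
    predictions = [sum(xi * ti for xi, ti in zip(row, theta)) for row in X]
    return sum((p - yi) ** 2 for p, yi in zip(predictions, y)) / (2 * m)


# Illustrative data only: a column of ones plus one feature.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 2.0, 3.0]


def cost_of_theta(theta):
    """One-argument wrapper: X and y are captured from the enclosing scope."""
    return compute_cost(X, y, theta)


print(cost_of_theta([0.0, 1.0]))   # perfect fit for this data
print(cost_of_theta([0.0, 0.0]))
```

Calling compute_cost() with no arguments fails in Python too (with a TypeError), which is the same kind of mistake as the undefined 'y' error in the question.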
Actually, you are trying to run a function, and you can't run it without providing the required number of arguments. If you do, you get the following error:
>> computeCost
error: 'y' undefined near line 7 column 12
error: called from computeCost at line 7 column 3
As you see, here I'm calling this function without passing any argument.
SOLUTION:
You can test your code by running 'ex1' script. After that submit your work by calling 'submit' script.

Octave -inf and NaN

I searched the forum and found this thread, but it does not cover my question
Two ways around -inf
From a Machine Learning class, week 3, I am getting -inf when using log(0), which later turns into an NaN. The NaN results in no answer being given in a sum formula, so no scalar for J (a cost function which is the result of matrix math).
Here is a test of my function
>> sigmoid([-100;0;100])
ans =
3.7201e-44
5.0000e-01
1.0000e+00
This is as expected, but the hypothesis requires ans = 1-sigmoid
>> 1-ans
ans =
1.00000
0.50000
0.00000
and log(0) gives -Inf
>> log(ans)
ans =
0.00000
-0.69315
-Inf
-Inf rows do not add to the cost function, but the -Inf carries through to NaN, and I do not get a result. I cannot find any material on -Inf, but am thinking there is a problem with my sigmoid function.
Can you provide any direction?
The typical way to avoid infinity in these cases is to add eps to the operand:
log(ans + eps)
eps is a very, very small value, and won't affect the output for values of ans unless ans is zero:
>> z = [-100;0;100];
>> g = 1 ./ (1+exp(-z));
>> log(1-g + eps)
ans =
0.0000
-0.6931
-36.0437
Adding to the answers here, I really do hope you will provide some more context for your question (in particular, what you are actually trying to do).
I will go out on a limb and guess the context, just in case this is useful. You are probably doing machine learning, and trying to define a cost function based on the negative log likelihood of a model, and then trying to differentiate it to find the point where this cost is at its minimum.
In general for a reasonable model with a useful likelihood that adheres to Cromwell's rule, you shouldn't have these problems, but, in practice it happens. And presumably in the process of trying to calculate a negative log likelihood of a zero probability you get inf, and trying to calculate a differential between two points produces inf / inf = nan.
In this case, this is an 'edge case', and generally in computer science edge cases need to be spotted as exceptional circumstances and dealt with appropriately. The reality is that you can reasonably expect that inf isn't going to be your function's minimum! Therefore, whether you remove it from the calculations, or replace it by a very large number (whether arbitrarily or via machine precision) doesn't really make a difference.
So in practice you can do either of the two things suggested by others here, or even just detect such instances and skip them from the calculation. The practical result should be the same.
-Inf means negative infinity, which is the correct result here: log(x) tends to minus infinity as x approaches 0 from above, and floating-point arithmetic returns -Inf for log(0).
The easiest thing to do is to check your intermediate results and if the number is below some threshold (like 1e-12) then just set it to that threshold. The answers won't be perfect but they will still be pretty close.
Using the following as the sigmoid function:
function g = sigmoid(z)
g = 1 ./ (1 + e.^-z);
end
Then the following code runs with no issues. Choose the threshold value in the 'max' statement to be less than the expected noise in your measurements and you're good to go:
>> a = sigmoid([-100, 0, 100])
a =
3.7201e-44 5.0000e-01 1.0000e+00
>> b = 1-a
b =
1.00000 0.50000 0.00000
>> c = max(b, 1e-12)
c =
1.0000e+00 5.0000e-01 1.0000e-12
>> d = log(c)
d =
0.00000 -0.69315 -27.63102
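Both workarounds suggested in this thread (adding eps, or clamping to a threshold) can be compared side by side in a short plain-Python sketch. The helper name safe_log and its floor parameter are assumptions for this illustration. Note that Python's math.log raises an error for 0 instead of returning -Inf as Octave does, so the guard is needed either way.

```python
import math
import sys


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def safe_log(p, floor=1e-12):
    """Logarithm with the argument clamped from below, like log(max(p, 1e-12))."""
    return math.log(max(p, floor))


z = [-100.0, 0.0, 100.0]
one_minus_g = [1.0 - sigmoid(v) for v in z]   # last entry rounds to exactly 0.0

# Workaround 1: clamp to a threshold before taking the log.
clamped = [safe_log(p) for p in one_minus_g]

# Workaround 2: add machine epsilon before taking the log.
with_eps = [math.log(p + sys.float_info.epsilon) for p in one_minus_g]

print(clamped)
print(with_eps)
```

The two variants differ only in how negative the last entry becomes, which matches the point made above that the exact replacement value matters little as long as the result stays finite.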

Keep getting the error message "Arguments are not sufficiently instantiated" can't understand why

Keep getting the error Arguments are not sufficiently instantiated for the multiplication by addition rule I wrote as shown below.
mult(_, 0, 0). %base case for multiplying by 0
mult(X, 1, X). % another base case
mult(X, Y, Z) :-
Y > 1,
Y1 is Y - 1,
mult(X, Y1, Z1),
Z is X + Z1.
I am new to Prolog and really struggling with even such simple problems.
Any recommendations for books or online tutorials would be great.
I am running it on SWI-Prolog on Ubuntu Linux.
In your definition of mult/3 the first two arguments have to be known. If one of them is still a variable, an instantiation error will occur. E.g. mult(2, X, 6) will yield an instantiation error, although X = 3 is a correct answer; in fact, the only answer.
There are several options you have: successor arithmetics, constraints, or meta-logical predicates.
Here is a starting point with successor arithmetics:
add(0,Y,Y).
add(s(X),Y,s(Z)) :- add(X,Y,Z).
Another approach would be to use constraints over the integers. YAP and SWI have a library(clpfd) that can be used in a very flexible manner: Both for regular integer computations and the more general constraints. Of course, multiplication is already predefined:
?- A * B #= C.
A*B#=C.
?- A * B #= C, C = 6.
C = 6, A in -6.. -1\/1..6, A*B#=6, B in -6.. -1\/1..6.
?- A * B #= C, C = 6, A = 2.
A = 2, B = 3, C = 6.
Meta-logical predicates: I cannot recommend this option in which you would use var/1, nonvar/1, ground/1 to distinguish various cases and handle them differently. This is so error prone that I have rarely seen a correct program using them. In fact, even very well known textbooks contain serious errors!
I think you got the last two calls reversed. Don't you mean:
mult(X,Y,Z):- Y>1,Y1 is Y-1, Z1 is X+Z, mult(X,Y1,Z1).
Edit: nevermind that, looked at the code again and it doesn't make sense. I believe your original code is correct.
As for why that error is occurring, I need to know how you're calling the predicate. Can you give an example input?
The correct way of calling your predicate is mult(+X, +Y, ?Z):
?- mult(5,0,X).
X = 0
?- mult(5,1,X).
X = 5
?- mult(5,5,X).
X = 25
?- mult(4,4,16).
yes
?- mult(3,3,10).
no
etc. Calling it with a free variable in either of the first two arguments will produce that error, because one of them will be used on the right side of an is or on either side of the >, and those predicates expect ground terms to succeed.
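For readers more at home in an imperative language, the same restriction can be mimicked in a loose plain-Python analogue of the three clauses. This is only an analogy: Python has no logic variables, so None stands in for an unbound argument here, and the function name mult simply mirrors the predicate.

```python
def mult(x, y):
    """Multiply by repeated addition; mirrors the three Prolog clauses.

    Both arguments must be known integers (mode +X, +Y); the return value
    plays the role of Z.
    """
    if y == 0:                      # mult(_, 0, 0).
        return 0
    if y == 1:                      # mult(X, 1, X).
        return x
    if y > 1:                       # fails with TypeError if y is None ("unbound")
        return x + mult(x, y - 1)   # Z is X + Z1
    raise ValueError("y must be a non-negative integer")


print(mult(5, 0), mult(5, 1), mult(5, 5))

try:
    mult(2, None)                   # roughly like asking ?- mult(2, Y, 6).
except TypeError as err:
    print("comparison on an unknown value fails:", err)
```

The comparison y > 1 is where the unknown value is rejected, just as the arithmetic comparison in the Prolog rule is where the instantiation error is raised.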