Iterate over each row in matrix Octave

How do I iterate over each row in Z, where Z is an m x 2 matrix:
6.1101,17.592
5.5277,9.1302
8.5186,13.662
How do I access each Z(i)(j) inside this loop?
For example:
for i = z
fprintf('Iterating over row: '+ i);
disp (i:1);
disp (i:2);
end
Would output:
Iterating over row: 1
6.1101
17.592
Iterating over row: 2
5.5277
9.1302
Iterating over row: 3
8.5186
13.662

If you use for i = z when z is a matrix, then i takes the value of each column of z in turn: first the first column (6.1101; 5.5277; 8.5186), then the second column, and so on. See the Octave manual, section "The for Statement".
If you want to iterate over all elements you could use
z = [6.1101,17.592;5.5277,9.1302;8.5186,13.662]
for i = 1:rows(z)
for j = 1:columns(z)
printf("z(%d,%d) = %f\n", i, j, z(i,j));
endfor
endfor
which outputs:
z(1,1) = 6.110100
z(1,2) = 17.592000
z(2,1) = 5.527700
z(2,2) = 9.130200
z(3,1) = 8.518600
z(3,2) = 13.662000
But keep in mind that for loops are slow in Octave, so it may be desirable to use a vectorized method. Many built-in functions accept a matrix as input for the most common calculations.
For example if you want to calculate the overall sum:
octave> sum (z(:))
ans = 60.541
Or the difference between two adjacent rows:
octave> diff (z)
ans =
-0.58240 -8.46180
2.99090 4.53180

You can transpose the matrix first and then do a for statement like so:
for i = z'
disp(i(1))
disp(i(2))
end
Although in this case you won't have an index telling you which row you are on.
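If the row index is needed as well, one option is to loop over the row numbers and slice each row out. This is a minimal sketch, assuming the same z as in the question; the variable name row is only illustrative:

```octave
z = [6.1101, 17.592; 5.5277, 9.1302; 8.5186, 13.662];
for i = 1:rows(z)
  row = z(i, :);                          % 1x2 vector holding row i
  printf("Iterating over row: %d\n", i);
  disp(row(1));
  disp(row(2));
end
```

Here i is the row number and row(1), row(2) are the two entries of that row, which matches the output requested in the question.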

Related

Trying to find index of minimum value in a list of vars in Octave

I have a list of vars with different values
a = 2
b = 1
c = 12343243
d = 8998
I can find the smallest value
aSmallestVALUE = min([a, b, c, d])
and index
[v,idx]=min([a, b, c, d])
I want to find the index of each variable and sort this list in ascending order,
something like:
sorted list = b, a, d, c
Obviously if you want to treat those four variables as a 'list' to be sorted, you need to be working with a 'list' construct, not 4 isolated variables.
L = [2, 1, 12343243, 8998];
Otherwise it makes no sense to talk about the 'index' of an existing independent variable (though obviously you can construct this L from a bunch of pre-existing variables if desired).
With L in hand, you can now do
[minval, idx] = min( L )
% minval = 1
% idx = 2
to find the minimum and its corresponding index, and
[sorted, sortedindices] = sort( L )
% sorted =
% 1.0000e+00 2.0000e+00 8.9980e+03 1.2343e+07
%
% sortedindices =
% 2 1 4 3
to obtain a sorted array, with corresponding indices.
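If the goal is the sorted variable names themselves (b, a, d, c), one possible approach is to keep the names in a cell array in the same order as L and index it with the sort indices. This is a sketch; names and sortednames are illustrative variables that do not appear in the question:

```octave
a = 2; b = 1; c = 12343243; d = 8998;
names = {'a', 'b', 'c', 'd'};            % labels, in the same order as L
L = [a, b, c, d];
[sorted, sortedindices] = sort(L);
sortednames = names(sortedindices);      % cell array: b, a, d, c
disp(strjoin(sortednames, ', '));
```

The labels have to be maintained by hand, which is another reason to work with a single array from the start rather than four separate variables.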

Problem with Octave. Can't recognize a matrix

I try to build a script on Octave and I receive this message:
error: script2: =: nonconformant arguments (op1 is 1x1, op2 is 1x10)
error: called from
script2 at line 5 column 1
My script is:
l = 20:29;
m = 30;
for i = 0:9
a(i + 1) = l / m;
end
Can someone help me fix this?
Octave lets you assign to a name that does not exist yet by creating it on the fly, and you can grow a vector by assigning to the index one past its end.
However, the left-hand side a(i + 1) refers to a single element (1x1), whereas l / m is 1x10. That mismatch is what your error message is telling you.
There are a couple of workarounds. If you want to just accumulate the rows of a matrix, add a second dimension:
a(i + 1, :) = l / m;
If you want columns:
a(:, i + 1) = l / m;
The problem with this approach is that it reallocates the matrix at every iteration. The recommended approach is to pre-allocate the matrix a and fill it in:
l = 20:29;
m = 30;
a = zeros(10, 10);
for i = 1:10
a(i, :) = l / m;
end
Since Octave is capable of doing matrix operations, you don't need the for loop in the first place.
I would rather write:
l = 20:29;
m = 30;
a = l / m;
This is much more efficient.
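Note that a = l / m yields a single 1x10 row, whereas the loop versions above fill a 10x10 matrix. If the 10x10 result is what is actually wanted (an assumption, since the question does not say), a loop-free sketch could use repmat:

```octave
l = 20:29;
m = 30;
row = l / m;                 % 1x10 row vector
a = repmat(row, 10, 1);      % 10x10 matrix, every row equal to l / m
disp(size(a));
```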

How to calculate a probability vector and an observation count vector for a range of bins?

I want to test the hypothesis that a set of 30 observations fits a Poisson distribution.
#GNU Octave
X = [8 0 0 1 3 4 0 2 12 5 1 8 0 2 0 1 9 3 4 5 3 3 4 7 4 0 1 2 1 2]; #30 observations
bins = {0, 1, [2:3], [4:5], [6:20]}; #each bin can be single value or multiple values
I am trying to use Pearson's chi-square statistic here and wrote the function below. I want a vector containing the Poisson probability for each bin, and a vector containing the observation count for each bin. I feel the loop is rather redundant and ugly. Can you please let me know how I can refactor the function without the loop and make the whole calculation cleaner and more vectorized?
function result= poissonGoodnessOfFit(bins, observed)
assert(iscell(bins), "bins should be a cell array");
assert(all(cellfun("ismatrix", bins)) == 1, "bin entries either scalars or matrices");
assert(ismatrix(observed) && rows(observed) == 1, "observed data should be a 1xn matrix");
lambda_head = mean(observed); #poisson lambda parameter estimate
k = length(bins); #number of bin groups
n = length(observed); #number of observations
poisson_probability = []; #variable for poisson probability for each bin
observations = []; #variable for observation counts for each bin
for i=1:k
if isscalar(bins{1,i}) #this bin contains a single value
poisson_probability(1,i) = poisspdf(bins{1, i}, lambda_head);
observations(1, i) = histc(observed, bins{1, i});
else #this bin contains a range of values
inner_bins = bins{1, i}; #retrieve the range
inner_bins_k = length(inner_bins); #number of values inside
inner_poisson_probability = []; #variable to store individual probability of each value inside this bin
inner_observations = []; #variable to store observation counts of each value inside this bin
for j=1:inner_bins_k
inner_poisson_probability(1,j) = poisspdf(inner_bins(1, j), lambda_head);
inner_observations(1, j) = histc(observed, inner_bins(1, j));
endfor
poisson_probability(1, i) = sum(inner_poisson_probability, 2); #assign over the sum of all inner probabilities
observations(1, i) = sum(inner_observations, 2); #assign over the sum of all inner observation counts
endif
endfor
expected = n .* poisson_probability; #expected observations if indeed poisson using lambda_head
chisq = sum((observations - expected).^2 ./ expected, 2); #Pearson Chi-Square statistics
pvalue = 1 - chi2cdf(chisq, k-1-1);
result = struct("actual", observations, "expected", expected, "chi2", chisq, "pvalue", pvalue);
return;
endfunction
There are a couple of things worth noting in the code.
First, the 'scalar' case in your if block is actually identical to your 'range' case, since a scalar is simply a range of 1 element. So no special treatment is needed for it.
Second, you don't need to create such explicit subranges; your bin groups are amenable to being used as indices into a larger result (as long as you add 1 to convert from 0-indexed to 1-indexed indices).
Therefore my approach would be to calculate the expected and observed numbers over the entire domain of interest (as inferred from your bin groups), and then use the bin groups themselves as 1-indices to obtain the desired subgroups, summing accordingly.
Here's some example code, written in the octave/matlab compatible subset of both languages:
function Result = poissonGoodnessOfFit( BinGroups, Observations )
% POISSONGOODNESSOFFIT( BinGroups, Observations) calculates the [... etc, etc.]
pkg load statistics; % only needed in octave; for matlab buy statistics toolbox.
assert( iscell( BinGroups ), 'Bins should be a cell array' );
assert( all( cellfun( @ismatrix, BinGroups ) ) == 1, 'Bin entries either scalars or matrices' );
assert( ismatrix( Observations ) && size( Observations, 1 ) == 1, 'Observed data should be a 1xn matrix' );
% Define helpful variables
RangeMin = min( cellfun( @min, BinGroups ) );
RangeMax = max( cellfun( @max, BinGroups ) );
Domain = RangeMin : RangeMax;
LambdaEstimate = mean( Observations );
NBinGroups = length( BinGroups );
NObservations = length( Observations );
% Get expected and observed numbers per 'bin' (i.e. discrete value) over the *entire* domain.
Expected_Domain = NObservations * poisspdf( Domain, LambdaEstimate );
Observed_Domain = histc( Observations, Domain );
% Apply BinGroup values as indices
Expected_byBinGroup = cellfun( @(c) sum( Expected_Domain(c+1) ), BinGroups );
Observed_byBinGroup = cellfun( @(c) sum( Observed_Domain(c+1) ), BinGroups );
% Perform a Chi-Square test on the Bin-wise Expected and Observed outputs
O = Observed_byBinGroup; E = Expected_byBinGroup ; df = NBinGroups - 1 - 1;
ChiSquareTestStatistic = sum( (O - E) .^ 2 ./ E );
PValue = 1 - chi2cdf( ChiSquareTestStatistic, df );
Result = struct( 'actual', O, 'expected', E, 'chi2', ChiSquareTestStatistic, 'pvalue', PValue );
end
Running with your example gives:
X = [8 0 0 1 3 4 0 2 12 5 1 8 0 2 0 1 9 3 4 5 3 3 4 7 4 0 1 2 1 2]; % 30 observations
bins = {0, 1, [2:3], [4:5], [6:20]}; % each bin can be single value or multiple values
Result = poissonGoodnessOfFit( bins, X )
% Result =
% scalar structure containing the fields:
% actual = 6 5 8 6 5
% expected = 1.2643 4.0037 13.0304 8.6522 3.0493
% chi2 = 21.989
% pvalue = 0.000065574
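The first point above (a scalar bin is just a one-element range) can be checked in isolation. The sketch below counts the observations per bin group with ismember, which handles scalar and vector bins in the same way and does not need the statistics package; counts is an illustrative variable name:

```octave
X = [8 0 0 1 3 4 0 2 12 5 1 8 0 2 0 1 9 3 4 5 3 3 4 7 4 0 1 2 1 2];
bins = {0, 1, [2:3], [4:5], [6:20]};
% For each bin group, count how many observations fall inside it.
counts = cellfun(@(c) sum(ismember(X, c)), bins);
disp(counts);   % 6 5 8 6 5, matching the 'actual' field above
```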
A general comment about the code: it is always preferable to write self-explanatory code rather than code that makes no sense without a comment. Comments should generally explain the 'why' rather than the 'how'.

Weighted sample in Octave

I've been searching everywhere and found nothing. Is there any way of performing a weighted sample in Octave?
That is, if we have two vectors e and v, where sum(v) = 1, a way of sampling n elements from e with probabilities v.
You want to draw an index according to the probabilities in vector 'v' and then pick the corresponding element of 'e'. For that you can use inverse transform sampling. A simple way to do it is, for instance:
clear
close all
e = [10 20 30 40 50];
v = [0.1 0.2 0.5 0.1 0.1];
cdf = cumsum(v);
n = 1000;
E = [];
for i=1:n
r = rand;
idx = find(cdf>r);
E = [E e( idx(1) )];
end
hist(E)
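The loop above grows E one element at a time. A vectorized variant of the same idea might look like the sketch below, assuming an Octave version with automatic broadcasting. Forcing cdf(end) to 1 is an added guard against rounding in cumsum and is not part of the original answer:

```octave
e = [10 20 30 40 50];
v = [0.1 0.2 0.5 0.1 0.1];
n = 1000;
cdf = cumsum(v);
cdf(end) = 1;                  % guard: make sure the last edge is exactly 1
r = rand(n, 1);                % n uniform draws in (0, 1)
% Compare every draw against every cdf edge (n x numel(v) logical matrix);
% the number of edges strictly below a draw, plus 1, is the sampled index.
idx = sum(r > cdf, 2) + 1;
E = e(idx);
```

As in the loop version, hist(E) can be used to check that the sample frequencies roughly follow v.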

How can I prove the correctness of the following algorithm?

Consider the following algorithm Min, which takes lists x and y and an integer z as parameters and returns the zth smallest element in the union of x and y.
Preconditions: x and y are sorted lists of ints in increasing order and they are disjoint.
Note that this is pseudocode, so indexing starts at 1, not 0.
Min(x,y,z):
if z = 1:
return(min(x[1], y[1]))
if z = 2:
if x[1] < y[1]:
return(min(x[2],y[1]))
else:
return(min(x[1], y[2]))
q = Ceiling(z/2) //round up z/2
if x[q] < y[z-q + 1]:
return(Min(x[q:z], y[1:(z - q + 1)], (z-q +1)))
else:
return(Min(x[1:q], y[(z - q + 1):z], q))
I can prove that it terminates, because z strictly decreases in every recursive call and will eventually reach one of the base cases, but I can't prove partial correctness.
Your code is not correct.
Consider the following input:
x = [0,1]
y = [2]
z = 3
You then get q = 2 and, in the if clause that follows, access y[z-q+1], i.e. y[2]. This is an array bounds violation.
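The counterexample can be reproduced with a few lines of Octave (which is also 1-indexed). This sketch only recomputes the index that the pseudocode would read and the value a correct algorithm should return; the names needed, out_of_bounds, merged and expected are illustrative:

```octave
x = [0, 1];
y = [2];
z = 3;
q = ceil(z / 2);                      % q = 2
needed = z - q + 1;                   % index into y read by the comparison: 2
out_of_bounds = needed > numel(y);    % true, since y has only one element
merged = sort([x, y]);                % brute-force reference: merge and sort
expected = merged(z);                 % the 3rd smallest element, i.e. 2
printf("needed index = %d, numel(y) = %d, expected = %d\n", needed, numel(y), expected);
```

A brute-force reference like merged(z) is also a convenient way to test any corrected version of Min on small inputs before attempting a proof.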