I want to test the hypothesis that a set of 30 observations follows a Poisson distribution.
#GNU Octave
X = [8 0 0 1 3 4 0 2 12 5 1 8 0 2 0 1 9 3 4 5 3 3 4 7 4 0 1 2 1 2]; #30 observations
bins = {0, 1, [2:3], [4:5], [6:20]}; #each bin can be single value or multiple values
I am trying to use Pearson's chi-square statistics here and coded the below function. I want a Poisson vector to contain corresponding Poisson probabilities for each bin and count the observations for each bin. I feel the loop is rather redundant and ugly. Can you please let me know how can I re-factor the function without the loop and make the whole calculation cleaner and more vectorized?
function result= poissonGoodnessOfFit(bins, observed)
assert(iscell(bins), "bins should be a cell array");
assert(all(cellfun("ismatrix", bins)) == 1, "bin entries either scalars or matrices");
assert(ismatrix(observed) && rows(observed) == 1, "observed data should be a 1xn matrix");
lambda_head = mean(observed); #poisson lambda parameter estimate
k = length(bins); #number of bin groups
n = length(observed); #number of observations
poisson_probability = []; #variable for poisson probability for each bin
observations = []; #variable for observation counts for each bin
for i=1:k
if isscalar(bins{1,i}) #this bin contains a single value
poisson_probability(1,i) = poisspdf(bins{1, i}, lambda_head);
observations(1, i) = histc(observed, bins{1, i});
else #this bin contains a range of values
inner_bins = bins{1, i}; #retrieve the range
inner_bins_k = length(inner_bins); #number of values inside
inner_poisson_probability = []; #variable to store individual probability of each value inside this bin
inner_observations = []; #variable to store observation counts of each value inside this bin
for j=1:inner_bins_k
inner_poisson_probability(1,j) = poisspdf(inner_bins(1, j), lambda_head);
inner_observations(1, j) = histc(observed, inner_bins(1, j));
endfor
poisson_probability(1, i) = sum(inner_poisson_probability, 2); #assign over the sum of all inner probabilities
observations(1, i) = sum(inner_observations, 2); #assign over the sum of all inner observation counts
endif
endfor
expected = n .* poisson_probability; #expected observations if indeed poisson using lambda_head
chisq = sum((observations - expected).^2 ./ expected, 2); #Pearson Chi-Square statistics
pvalue = 1 - chi2cdf(chisq, k-1-1);
result = struct("actual", observations, "expected", expected, "chi2", chisq, "pvalue", pvalue);
return;
endfunction
There's a couple of things worth noting in the code.
First, the 'scalar' case in your if block is actually identical to your 'range' case, since a scalar is simply a range of 1 element. So no special treatment is needed for it.
Second, you don't need to create such explicit subranges: your bin groups can be used directly as indices into a larger result (as long as you add 1 to convert from 0-indexed to 1-indexed indices).
Therefore my approach would be to calculate the expected and observed numbers over the entire domain of interest (as inferred from your bin groups), and then use the bin groups themselves as 1-indices to obtain the desired subgroups, summing accordingly.
Here's some example code, written in the Octave/MATLAB-compatible subset of both languages:
function Result = poissonGoodnessOfFit( BinGroups, Observations )
% POISSONGOODNESSOFFIT( BinGroups, Observations) calculates the [... etc, etc.]
pkg load statistics; % only needed in octave; for matlab buy statistics toolbox.
assert( iscell( BinGroups ), 'Bins should be a cell array' );
assert( all( cellfun( @ismatrix, BinGroups ) ) == 1, 'Bin entries either scalars or matrices' );
assert( ismatrix( Observations ) && rows( Observations ) == 1, 'Observed data should be a 1xn matrix' );
% Define helpful variables
RangeMin = min( cellfun( @min, BinGroups ) );
RangeMax = max( cellfun( @max, BinGroups ) );
Domain = RangeMin : RangeMax;
LambdaEstimate = mean( Observations );
NBinGroups = length( BinGroups );
NObservations = length( Observations );
% Get expected and observed numbers per 'bin' (i.e. discrete value) over the *entire* domain.
Expected_Domain = NObservations * poisspdf( Domain, LambdaEstimate );
Observed_Domain = histc( Observations, Domain );
% Apply BinGroup values as indices
Expected_byBinGroup = cellfun( @(c) sum( Expected_Domain(c+1) ), BinGroups );
Observed_byBinGroup = cellfun( @(c) sum( Observed_Domain(c+1) ), BinGroups );
% Perform a Chi-Square test on the Bin-wise Expected and Observed outputs
O = Observed_byBinGroup; E = Expected_byBinGroup ; df = NBinGroups - 1 - 1;
ChiSquareTestStatistic = sum( (O - E) .^ 2 ./ E );
PValue = 1 - chi2cdf( ChiSquareTestStatistic, df );
Result = struct( 'actual', O, 'expected', E, 'chi2', ChiSquareTestStatistic, 'pvalue', PValue );
end
Running with your example gives:
X = [8 0 0 1 3 4 0 2 12 5 1 8 0 2 0 1 9 3 4 5 3 3 4 7 4 0 1 2 1 2]; % 30 observations
bins = {0, 1, [2:3], [4:5], [6:20]}; % each bin can be single value or multiple values
Result = poissonGoodnessOfFit( bins, X )
% Result =
% scalar structure containing the fields:
% actual = 6 5 8 6 5
% expected = 1.2643 4.0037 13.0304 8.6522 3.0493
% chi2 = 21.989
% pvalue = 0.000065574
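As an independent cross-check of the per-bin numbers above, here is a minimal Python sketch of the same calculation using only the standard library. The names `poisson_pmf` and `poisson_gof` are illustrative and not part of the Octave code. The p-value step is left out because it needs a chi-square CDF, which the standard library does not provide.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Poisson probability of observing exactly k events with rate lam."""
    return exp(-lam) * lam ** k / factorial(k)

def poisson_gof(bin_groups, observations):
    """Return (observed, expected, chi2) for the given bin groups.

    Each bin group is a list of the integer values that fall into that bin.
    """
    n = len(observations)
    lam = sum(observations) / n  # lambda estimated as the sample mean
    observed = [sum(1 for x in observations if x in group) for group in bin_groups]
    expected = [n * sum(poisson_pmf(k, lam) for k in group) for group in bin_groups]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return observed, expected, chi2

X = [8, 0, 0, 1, 3, 4, 0, 2, 12, 5, 1, 8, 0, 2, 0, 1, 9, 3, 4, 5,
     3, 3, 4, 7, 4, 0, 1, 2, 1, 2]
bins = [[0], [1], [2, 3], [4, 5], list(range(6, 21))]

observed, expected, chi2 = poisson_gof(bins, X)
print(observed)                         # counts per bin group
print([round(e, 4) for e in expected])  # compare with the 'expected' field above
print(round(chi2, 3))                   # compare with the 'chi2' field above
```

The membership test `x in group` plays the role of indexing with the bin groups in the Octave version: both sum a per-value quantity over the values that belong to each group.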
A general comment about the code: it is always preferable to write self-explanatory code, rather than code that does not make sense by itself in the absence of a comment. Comments should generally only be used to explain the 'why', rather than the 'how'.
I'm still new to VBA. I have a database and, with help from other people, I've finally been able to run a validation check when importing. I can check for numbers stored as text, but I'm stuck when I need to allow alphabetic characters or a blank cell. This is what I have for numbers as text. I need two checks: 1) alphanumeric or blank (null), and 2) numeric or blank (null).
Function chk2(A As String) As Boolean
Dim i As Integer, l As Integer, c As String
l = Len(A)
If l = 4 Then
chk2 = True
For i = 1 To l
c = Mid(A, i, 1)
If Not (c >= "0" And c <= "9") Then
chk2 = False
Exit Function
End If
Next i
End If
End Function
This one works fine as long as there are characters to fill in each row/cell.
Thanks in advance for your help.
If you're returning a Boolean, only return True once everything has executed correctly; that way you don't get a false positive if something fails.
To handle a blank cell, check whether the string has zero length, i.e. If Len(A) = 0.
Your If Not statement checks whether the character c falls between "0" and "9". You can add Or conditions to also accept characters between "a" and "z", between "A" and "Z", or a space, which is character 32, Chr(32):
Function chk2(A As String) As Boolean
Dim i As Integer, l As Integer, c As String
chk2 = False
l = Len(A)
If l = 0 Then
'do something if the cell is blank
chk2 = True
ElseIf l = 4 Then
For i = 1 To l
c = Mid(A, i, 1)
If Not ((c >= "0" And c <= "9") Or (c >= "a" And c <= "z") Or (c >= "A" And c <= "Z") Or c = Chr(32)) Then
Exit Function
End If
Next i
chk2 = True
End If
End Function
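The question asks for two checks, and the function above covers only the alphanumeric one. As a language-neutral illustration of how both can share one helper, here is a small Python sketch. The names `is_blank_or_made_of`, `check_alphanumeric`, and `check_numeric` are hypothetical, and the required length of 4 is carried over from the original chk2 as an assumption.

```python
import string

# Characters accepted by each check; the space is allowed in the
# alphanumeric check because the VBA answer accepts Chr(32).
ALLOWED_ALNUM = set(string.ascii_letters + string.digits + " ")
ALLOWED_DIGITS = set(string.digits)

def is_blank_or_made_of(value, allowed, required_length=4):
    """True if value is empty, or has the required length and only allowed characters."""
    if value == "":
        return True
    return len(value) == required_length and all(ch in allowed for ch in value)

def check_alphanumeric(value):
    """Check 1: alphanumeric (or space) characters, or blank."""
    return is_blank_or_made_of(value, ALLOWED_ALNUM)

def check_numeric(value):
    """Check 2: digits only, or blank."""
    return is_blank_or_made_of(value, ALLOWED_DIGITS)

print(check_alphanumeric("A1 b"), check_alphanumeric("A1-b"))
print(check_numeric("1234"), check_numeric("12a4"), check_numeric(""))
```

In VBA the same idea would mean keeping one loop and passing in the set of allowed characters, rather than writing a second copy of chk2.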
I'm trying to cycle through certain rows in my Excel spreadsheet. In the first loop I step through every third row to see whether it is hidden, and in the second loop I step through every second row. I want to count the rows that satisfy the condition across both loops and return that count. The "Return y" line is giving me an error.
Function FindHiddenRows() As Integer
Dim x As Integer
Dim y As Integer
y = 0
For x = 23 To 38 Step 3
If Rows("x:x").EntireRow.Hidden = False Then
y = y + 1
End If
Next x
For x = 40 To 46 Step 2
If Rows("x:x").EntireRow.Hidden = False Then
y = y + 1
End If
Next x
Return y
End Function
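VBA has no Return statement for functions; a function returns a value by assigning it to its own name (FindHiddenRows = y). Also, Rows("x:x") passes the literal text "x:x" rather than the value of the variable, so use Rows(x) instead. To make it fast, short, and easy: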
Function FindHiddenRows() As Byte
Dim x As Byte, y As Byte
For x = 22 To 46 Step 2
If x < 38 Then x = x + 1
If Not Rows(x).Hidden Then y = y + 1
Next
FindHiddenRows = y
End Function
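The single loop above is compact but not obvious, because it changes the loop counter inside the loop body. The following Python sketch emulates that loop to list which rows it visits and compares them with the two ranges from the question. It assumes that VBA adds the step to whatever value the counter holds when `Next` is reached; the function name `rows_visited_by_combined_loop` is made up for this illustration.

```python
def rows_visited_by_combined_loop():
    """Emulate: For x = 22 To 46 Step 2 / If x < 38 Then x = x + 1."""
    visited = []
    x = 22
    while x <= 46:
        if x < 38:
            x += 1          # turns the step of 2 into an effective step of 3
        visited.append(x)   # this is the row whose Hidden property is tested
        x += 2              # what Next does with Step 2
    return visited

# Rows from the two loops in the question.
rows_from_question = list(range(23, 39, 3)) + list(range(40, 47, 2))

print(rows_visited_by_combined_loop())
print(rows_from_question)
```

Both lists contain the same ten rows, so the combined loop tests exactly the rows the two original loops were meant to test.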
A little new here! I am trying to calculate the Mahalanobis distance between 2 vectors, for which I need to calculate the covariance matrix between 2 vectors.
I have the following code to do this: basically the vectors are row vectors of size 1x8..these are stored in the variables x1 and x2. A difference of these 2 is then taken and stored in diff.
There is a separate function to calculate the covariance matrix and this is stored in covar1.
However when I execute this, I get a runtime error 1004 on the line calculating the inverse of the covariance matrix:
covarinv = WorksheetFunction.MInverse(covar1)
Any help on this would be immensely appreciated!
Thank you.
Sub calculat()
Dim x1() As Variant
Dim x2() As Variant
Dim diff() As Variant
Dim covar1 As Variant
Dim covarinv As Variant
Dim md() As Variant
x1 = Range("b3:i3")
x2 = Range("b4:i4")
n1 = UBound(x1, 1)
n2 = UBound(x1, 2)
m1 = UBound(x2, 1)
m2 = UBound(x2, 2)
ReDim diff(1 To n1, 1 To n2)
For j = 1 To n1
For i = 1 To n2
diff(j, i) = x1(j, i) - x2(j, i)
Next i
Next j
covar1 = VarCov(Range("b3:i7"))
covarinv = WorksheetFunction.MInverse(covar1)
temp = WorksheetFunction.MMult(diff, covarinv)
difft = WorksheetFunction.Transpose(diff)
md = WorksheetFunction.MMult(temp, difft)
End Sub
Function VarCov(rng As Range) As Variant
Dim i As Integer
Dim j As Integer
Dim colnum As Integer
Dim matrix() As Double
colnum = rng.Columns.Count
ReDim matrix(colnum - 1, colnum - 1)
For i = 1 To colnum
For j = 1 To colnum
matrix(i - 1, j - 1) = Application.WorksheetFunction.covar(rng.Columns(i), rng.Columns(j))
Next j
Next i
VarCov = matrix
End Function
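One thing worth checking, offered as a guess rather than a confirmed diagnosis: the covariance matrix is built from Range("b3:i7"), which is 5 rows of 8 columns. A covariance matrix estimated from 5 observations of 8 variables has rank at most 4, so the 8x8 matrix cannot be inverted, and a failed MInverse call would be consistent with the error reported. The Python sketch below illustrates the rank argument with made-up data; the functions `covariance_matrix` and `matrix_rank` and the sample values are all hypothetical.

```python
def covariance_matrix(rows):
    """Population covariance (divide by n, like Excel's COVAR) between columns."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    return [[sum((a - mi) * (b - mj) for a, b in zip(ci, cj)) / n
             for cj, mj in zip(cols, means)]
            for ci, mi in zip(cols, means)]

def matrix_rank(matrix, tol=1e-9):
    """Rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank

# Made-up stand-in for b3:i7: 5 observations (rows) of 8 variables (columns).
data = [
    [2, 4, 1, 7, 3, 5, 6, 8],
    [3, 1, 4, 6, 2, 8, 5, 7],
    [5, 3, 2, 8, 4, 6, 7, 1],
    [1, 6, 5, 2, 7, 3, 8, 4],
    [4, 2, 7, 3, 6, 1, 5, 9],
]

cov = covariance_matrix(data)
print(len(cov), "x", len(cov[0]), "matrix with rank", matrix_rank(cov))
```

If this is the cause, the usual remedies are to use more observations than variables or to reduce the number of variables before inverting.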
How can I represent an integer in binary,
so that I can print 7 as 111?
You write a function to do this.
num=7
function toBits(num)
-- returns a table of bits, least significant first.
local t={} -- will contain the bits
while num>0 do
rest=math.fmod(num,2)
t[#t+1]=rest
num=(num-rest)/2
end
return t
end
bits=toBits(num)
print(table.concat(bits))
Lua 5.2 already ships with bitwise functions that can help you here (the bit32 library).
Here is the most-significant-first version, with optional leading 0 padding to a specified number of bits:
function toBits(num,bits)
-- returns a table of bits, most significant first.
bits = bits or math.max(1, select(2, math.frexp(num)))
local t = {} -- will contain the bits
for b = bits, 1, -1 do
t[b] = math.fmod(num, 2)
num = math.floor((num - t[b]) / 2)
end
return t
end
There's a faster way to do this that takes advantage of string.format, which can format numbers in base 8 (octal) with '%o'. It's then trivial to convert base 8 to binary, because each octal digit maps to exactly three bits.
--create lookup table for octal to binary
oct2bin = {
['0'] = '000',
['1'] = '001',
['2'] = '010',
['3'] = '011',
['4'] = '100',
['5'] = '101',
['6'] = '110',
['7'] = '111'
}
function getOct2bin(a) return oct2bin[a] end
function convertBin(n)
local s = string.format('%o', n)
s = s:gsub('.', getOct2bin)
return s
end
If you want to keep them all the same size, then do
s = string.format('%.22o', n)
Which gets you 66 bits. That's two extra leading bits, since octal works in groups of 3 bits and 64 isn't divisible by 3. If you want 33 bits, change the 22 to 11.
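To make the digit-by-digit mapping concrete, here is the same octal lookup idea as a short Python sketch. It is only an illustration of the technique, not a replacement for the Lua code: `convert_bin` mirrors the Lua function name, the `octal_digits` padding parameter is an addition for this example, and non-negative integers are assumed.

```python
# Lookup table from one octal digit to its three-bit binary form.
OCT_TO_BIN = {format(d, "o"): format(d, "03b") for d in range(8)}

def convert_bin(n, octal_digits=0):
    """Convert a non-negative integer to binary via its octal representation.

    octal_digits pads the octal string with leading zeros, so the result
    has at least 3 * octal_digits bits.
    """
    octal = format(n, "o").rjust(octal_digits, "0")
    return "".join(OCT_TO_BIN[digit] for digit in octal)

print(convert_bin(7))             # 111
print(convert_bin(8))             # 001000
print(len(convert_bin(7, 22)))    # 66
```

Because every octal digit expands to three bits, the result can carry up to two leading zeros, as `convert_bin(8)` shows; strip them if an exact-width result is not needed.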
If you have the BitOp library, which is available by default in LuaJIT, then you can do this:
function convertBin(n)
local t = {}
for i = 1, 32 do
n = bit.rol(n, 1)
table.insert(t, bit.band(n, 1))
end
return table.concat(t)
end
But note this only handles the low 32 bits! If your number is 2^32 or larger, the result won't be correct.
function bits(num)
local t={}
while num>0 do
rest=num%2
table.insert(t,1,rest)
num=(num-rest)/2
end return table.concat(t)
end
Nobody else has used table.insert so far, although it is useful here.
Here is a function inspired by the accepted answer, with corrected syntax, which returns a table of bits filled in from right to left.
num=255
bits=8
function toBits(num, bits)
-- returns a table of bits
local t={} -- will contain the bits
for b=bits,1,-1 do
rest=math.fmod(num,2)
t[b]=rest
num=(num-rest)/2
end
if num==0 then return t else return {'Not enough bits to represent this number'}end
end
bits=toBits(num, bits)
print(table.concat(bits))
>>11111111
function reverse(t)
local nt = {} -- new table
local size = #t + 1
for k,v in ipairs(t) do
nt[size - k] = v
end
return nt
end
function tobits(num)
local t={}
while num>0 do
rest=num%2
t[#t+1]=rest
num=(num-rest)/2
end
t = reverse(t)
return table.concat(t)
end
print(tobits(7))
# 111
print(tobits(33))
# 100001
print(tobits(20))
# 10100
local function tobinary( number )
local str = ""
if number == 0 then
return "0"
elseif number < 0 then
number = - number
str = "-"
end
local power = 0
while true do
if 2^power > number then break end
power = power + 1
end
local dot = true
while true do
power = power - 1
if dot and power < 0 then
str = str .. "."
dot = false
end
if 2^power <= number then
number = number - 2^power
str = str .. "1"
else
str = str .. "0"
end
if number == 0 and power < 1 then break end
end
return str
end
May seem more verbose but it is actually faster than other functions that use the math library functions. Works with any number, be it positive/negative/fractional...
local function tobits(num, str) -- tail call
str = str or "B"
if num == 0 then return str end
return tobits(
num >> 1 , -- right shift
((num & 1)==1 and "1" or "0") .. str )
end
This function uses a lookup table to print a binary number extracted from its hex representation. It relies essentially on string manipulation throughout. Tested in Lua 5.1.
local bin_lookup = {
["0"] = "0000",
["1"] = "0001",
["2"] = "0010",
["3"] = "0011",
["4"] = "0100",
["5"] = "0101",
["6"] = "0110",
["7"] = "0111",
["8"] = "1000",
["9"] = "1001",
["A"] = "1010",
["B"] = "1011",
["C"] = "1100",
["D"] = "1101",
["E"] = "1110",
["F"] = "1111"
}
local print_binary = function(value)
local hs = string.format("%.2X", value) -- convert number to HEX
local ln, str = hs:len(), "" -- get length of string
for i = 1, ln do -- loop through each hex character
local index = hs:sub(i, i) -- each character in order
str = str .. bin_lookup[index] -- lookup a table
str = str .. " " -- add a space
end
return str
end
print(print_binary(45))
#0010 1101
print(print_binary(65000))
#1111 1101 1110 1000
This may not work in a Lua version that has no bit32 library.
function toBinary(number, bits)
local bin = {}
bits = bits - 1
while bits >= 0 do --As bit32.extract(1, 0) will return number 1 and bit32.extract(1, 1) will return number 0
--I do this in reverse order so that the most significant bit comes first
table.insert(bin, bit32.extract(number, bits))
bits = bits - 1
end
return bin
end
--Expected result 00000011
print(table.concat(toBinary(3, 8)))
This needs at least Lua 5.2, because the code uses the bit32 library.
Like Dave's version, but with the empty bits filled in:
local function toBits(num, bits)
-- returns a table of bits, least significant first.
local t={} -- will contain the bits
bits = bits or 8
while num>0 do
rest=math.fmod(num,2)
t[#t+1]=rest
num=math.floor((num-rest)/2)
end
for i = #t+1, bits do -- fill empty bits with 0
t[i] = 0
end
return t
end
for i = 0, 255 do
local bits = toBits(i)
print(table.concat(bits, ' '))
end
Result:
0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0
1 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0
1 0 1 0 0 0 0 0
...
0 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1