How to change the parameters of the binomial distribution used in a binomial glmer? - lme4

I wish to assess whether being part of the majority helps in emerging as the leader of an animal group.
Say I have 10 cases in which I assessed whether the leader came from the majority or the minority.
Leader <- c(1,1,1,1,0,1,1,1,0,1,0,0,0,0,1,0,0,0,1,0)
Case <- as.factor(c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10))
Majority <- as.factor(c("Maj","Maj","Maj","Maj","Maj","Maj","Maj","Maj","Maj","Maj",
"Min","Min","Min","Min","Min","Min","Min","Min","Min","Min"))
leadMaj <- data.frame(Leader,Case,Majority)
library(lme4)
binomial.glmer <- glmer(Leader ~ Majority + (1 | Case),
                        family = binomial, data = leadMaj)
summary(binomial.glmer)
The output says that being from the minority drastically decreases the probability of leading the group:
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
Family: binomial ( logit )
Formula: Leader ~ Majority + (1 | Case)
Data: leadMaj
AIC BIC logLik deviance df.resid
26 29 -10 20 17
Scaled residuals:
Min 1Q Median 3Q Max
-2.0 -0.5 0.0 0.5 2.0
Random effects:
Groups Name Variance Std.Dev.
Case (Intercept) 0 0
Number of obs: 20, groups: Case, 10
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.3863 0.7906 1.754 0.0795 .
MajorityMin -2.7726 1.1180 -2.480 0.0131 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
MajorityMin -0.707
However, each group was composed of 8 individuals in the majority and 2 individuals in the minority, so by chance alone the leader should come from the majority in 80% of the cases. And indeed, the majority led in 80% of the cases, which is exactly what is expected by chance.
So the question is: how can I test against a binomial distribution with p = 0.8, and not p = 0.5?
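Independently of the mixed-model machinery, the raw counts can be sanity-checked with an exact two-sided binomial test of the observed 8 majority-led cases out of 10 against the chance level p = 0.8. This is only an illustrative stdlib-Python sketch (the workflow above is in R, and the helper names here are made up):

```python
from math import comb

def binom_pmf(k, n, p):
    # Probability of exactly k successes in n trials with success prob. p.
    return comb(n, k) * p**k * (1 - p)**(n - k)

def exact_binom_test(k, n, p):
    # Two-sided p-value: sum the probabilities of every outcome that is
    # no more likely than the observed one.
    obs = binom_pmf(k, n, p)
    return sum(binom_pmf(i, n, p) for i in range(n + 1)
               if binom_pmf(i, n, p) <= obs + 1e-12)

# 8 of the 10 cases were led by the majority; 0.8 is the chance level
# implied by the 8:2 group composition.
print(exact_binom_test(8, 10, 0.8))  # close to 1.0: exactly the chance expectation
```

The p-value near 1 merely confirms that 8/10 is what p = 0.8 predicts; it does not replace the glmer, which also accounts for the Case random effect.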

Related

What does mask r-cnn's AP, AP50, AP70 mean?

I'm a novice with R-CNN.
The terms AP, AP50, and AP75 appear in the Mask R-CNN paper.
The 50 and 75 should be subscripts, but I can't format them that way here, sorry.
Anyway, the paper says AP is averaged over IoU thresholds.
For AP50, only candidates with more than 50% overlap with the ground truth are counted,
and for AP75 only candidates over 75%. Then what is plain AP? I assumed AP used a 70% IoU threshold, but that can't be it, because plain AP is lower than AP75.
Additionally, there are other terms I don't understand well: APS, APM, and APL. I know they mean small, medium, and large, but is there any criterion for how big each one is? Just saying small, medium, large is a bit confusing.
Thanks in advance!
I found the definitions here:
http://cocodataset.org/#detections-eval
Average Precision (AP):
AP           % AP at IoU=.50:.05:.95 (primary challenge metric)
AP^IoU=.50   % AP at IoU=.50 (PASCAL VOC metric)
AP^IoU=.75   % AP at IoU=.75 (strict metric)
AP Across Scales:
AP^small     % AP for small objects: area < 32^2
AP^medium    % AP for medium objects: 32^2 < area < 96^2
AP^large     % AP for large objects: area > 96^2
Average Recall (AR):
AR^max=1     % AR given 1 detection per image
AR^max=10    % AR given 10 detections per image
AR^max=100   % AR given 100 detections per image
AR Across Scales:
AR^small     % AR for small objects: area < 32^2
AR^medium    % AR for medium objects: 32^2 < area < 96^2
AR^large     % AR for large objects: area > 96^2
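The .50/.75 thresholds in those definitions are applied to the IoU (intersection over union) between a detected box and a ground-truth box. A minimal sketch of that computation, with boxes as (x1, y1, x2, y2) tuples (purely illustrative, not code from the paper or the COCO toolkit):

```python
def iou(a, b):
    # Intersection rectangle; width/height clamp to 0 when boxes don't overlap.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # 1.0 -> counts at every threshold
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 1/3 -> counts at neither .50 nor .75
```

A detection with IoU 0.6, say, would count toward AP50 but not AP75, while plain AP averages over the thresholds .50, .55, ..., .95.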

Number of samples for doing FFT

I have a set of 10006 samples covering 10 periods of a 50 Hz signal sampled at 50 kHz.
As you know, the frequency bins are spaced at SF/N, where SF is the sampling frequency and N is the number of samples.
I want the magnitudes at integer multiples of 5 Hz up to 9 kHz (for example: 5, 10, ..., 1025, 1030, ..., 8000, 8005, ..., 9000).
So if I do the FFT with 10006 samples, my frequency bins are no longer integer multiples of 5 Hz; instead they are integer multiples of 50000/10006.
And if I truncate my samples, I get bins at integer multiples of 5 Hz, but the samples no longer cover exactly 10 periods, which means I get a leakage effect!
So I am wondering how to get exactly 5 Hz bins without the spectrum being distorted by leakage.
You can truncate and then window to reduce "leakage" effects. Or zero-pad to 20k points without windowing and resample the FFT result.
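As a sketch of the truncation option: cutting the record to 10 000 samples makes the bin spacing exactly 50 000 / 10 000 = 5 Hz. The tone below is a synthetic, exactly-50 Hz signal standing in for the real data, so the truncated record still contains whole periods; with a real signal whose frequency is slightly off 50 Hz, you would apply a window after truncating, as suggested above.

```python
import numpy as np

fs = 50_000.0                     # sampling frequency in Hz
n_raw = 10_006                    # samples actually recorded
t = np.arange(n_raw) / fs
x = np.sin(2 * np.pi * 50.0 * t)  # synthetic exactly-50 Hz tone

x10k = x[:10_000]                 # truncate: 10 000 samples = 10 full periods
spec = np.abs(np.fft.rfft(x10k))
freqs = np.fft.rfftfreq(10_000, d=1 / fs)

print(freqs[1])                   # bin spacing: 5 Hz
print(freqs[np.argmax(spec)])     # all the energy lands in the 50 Hz bin
```

With an integer number of periods in the window, the 50 Hz tone falls exactly on bin 10 and the neighboring bins stay near zero, i.e. no leakage.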

Return Base 9 equivalent formula

I have been tackling an exercise given to us by our instructor which is to return the "base 9" equivalent of an inputted number.
The input number is: 231085 and the
return number is: 382871.
I have no idea how he came up with that so called "base 9" equivalent.
I tried looking on the web for the formula to get the base-9 equivalent, but the explanations were too difficult for me to understand, plus the fact that I am very weak in math and algebra.
I tried using modulo and division to solve it and came up with nothing (of course, my formula was wrong).
I'm really dumbfounded on this problem and I would appreciate it if anyone can enlighten me on the formula to solve it.
Or maybe the answer or the problem itself is all wrong?
Cheers!
The base-9 numbering system is a system that uses nine digits to represent numbers. That is,
231,085 = 2 × 10^5 + 3 × 10^4 + 1 × 10^3 + 0 × 10^2 + 8 × 10^1 + 5 × 10^0
in the base-10 system, a.k.a. the decimal numbering system. But in the base-9 system, you write it in terms of whole multiples of powers of 9, instead of powers of 10 as shown above:
381,881 = 3 × 9^5 + 8 × 9^4 + 1 × 9^3 + 8 × 9^2 + 8 × 9^1 + 1 × 9^0
(Your instructor gave you the wrong number, by the way. It's 381,881, not 382,871.)
Note that the coefficients of the powers of 10 in the base-10 representation (i.e., the 2, 3, 1, 0, 8, and 5) are always one of the ten decimal digits (zero through nine). Likewise, the coefficients of the powers of 9 in the base-9 representation (the 3, 8, 1, 8, 8, 1) are always one of the nine base-9 digits (zero through eight). Anything more and you'd have to carry it over, like you learned in the addition of multi-digit numbers in elementary school.
Now, for the algorithm to convert the base-10 representation to base-9, first take a look at Converting a decimal number into binary which converts from base-10 to base-2. The only difference is that you'd divide by powers of 9, instead of powers of 2 as this question does.
Following the example in the linked question,
[231085] [53938] [1450] [721] [73] [1]
 ÷59049   ÷6561  ÷729   ÷81   ÷9  ÷1
   [3]      [8]    [1]    [8]   [8]  [1]
If you want to systematically break down a base-10 integer into its digits, you'd follow this pattern:
Divide the number by 10 (the base).
The remainder of the division will be the next least significant digit.
Repeat with the new divided number (i.e. the quotient of the division of step 1) until the quotient reaches 0.
So, for 231,085, the iterations are as follows:
Step: 1 2 3 4 5 6
-------------------------------------------------------------
Number: 231,085 23,108 2,310 231 23 2
÷10 ÷10 ÷10 ÷10 ÷10 ÷10
-------------------------------------------------------------
Quotient: 23,108 2,310 231 23 2 0 <-- Quotient reached 0, so stop
Remainder: 5 8 0 1 3 2
As you can see, the remainder in each step is the next least significant digit in the number 231,085. That means 5 is the least significant digit. Then comes 8, which is really 8 × 10 = 80, and 10 > 1; then 0 × 100, and 100 > 10, etc.
Now if you were to divide by 9 in each step instead of by 10 as above, then the table would look something like
Step: 1 2 3 4 5 6
-------------------------------------------------------------
Number: 231,085 25,676 2,852 316 35 3
÷9 ÷9 ÷9 ÷9 ÷9 ÷9
-------------------------------------------------------------
Quotient: 25,676 2,852 316 35 3 0
Remainder: 1 8 8 1 8 3
And now the remainders are in reverse order of the base-9 representation of the base-10 number 231,085.
This answer doesn't actually give you the code for the base conversion, but the basic logic is outlined above, and the algorithm exists all over the internet (maybe for different bases, but all you need to change is the base in the division).
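That said, the repeated-division procedure above translates directly into a few lines of code. A sketch in Python (purely for illustration; any language works the same way):

```python
def to_base9(n):
    # Repeatedly divide by 9; each remainder is the next least
    # significant base-9 digit.
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, r = divmod(n, 9)
        digits.append(str(r))
    return "".join(reversed(digits))  # remainders come out in reverse order

print(to_base9(231085))  # 381881, not the instructor's 382871
```

In Python, int("381881", 9) converts back and returns 231085, which is a handy way to check the result.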
Your instructor's answer is incorrect.
http://www.wolframalpha.com/input/?i=231085+in+base+9

implementation of the Gower distance function

I have a matrix (28 columns and 47 rows) of numbers. The matrix has an extra row containing a header for each column ("ordinal" or "nominal").
I want to apply the Gower distance function to this matrix. The description I found says that:
The final dissimilarity between the ith and jth units is obtained as a weighted sum of dissimilarities for each variable:
d(i,j) = sum_k(delta_ijk * d_ijk ) / sum_k( delta_ijk )
In particular, d_ijk represents the distance between the ith and jth unit computed considering the kth variable. It depends on the nature of the variable:
factor or character columns are considered as categorical nominal variables, and d_ijk = 0 if x_ik = x_jk, 1 otherwise;
ordered columns are considered as categorical ordinal variables, and the values are substituted with the corresponding position index, r_ik, in the factor levels. These position indexes (which differ from the output of the R function rank) are transformed in the following manner:
z_ik = (r_ik - 1)/(max(r_ik) - 1)
These new values, z_ik, are treated as observations of an interval-scaled variable.
As far as the weight delta_ijk is concerned:
delta_ijk = 0 if x_ik = NA or x_jk = NA;
delta_ijk = 1 in all the other cases.
I know that there is a gower.dist function, but I must do it that way.
So, for "d_ijk", "delta_ijk" and "z_ik", I tried to make functions, as I didn't find a better way.
I started with "delta_ijk" and i tried this:
Delta=function(i,j){for (i in 1:28){for (j in 1:47){
{if (MyHeader[i,j]=="nominal")
result=0
{else if (MyHeader[i,j]=="ordinal") result=1}}}}
;result}
But I got error. So I got stuck and I can't do the rest.
P.S. Excuse me if I make mistakes, but English is not a language I use very often.
Why do you want to reinvent the wheel billyt? There are several functions/packages in R that will compute this for you, including daisy() in package cluster which comes with R.
First things first though, get those "data type" headers out of your data. If this truly is a matrix then character information in this header row will make the whole matrix a character matrix. If it is a data frame, then all columns will likely be factors. What you want to do is code the type of data in each column (component of your data frame) as 'factor' or 'ordered'.
df <- data.frame(A = c("ordinal",1:3), B = c("nominal","A","B","A"),
C = c("nominal",1,2,1))
Which gives this --- note that all are stored as factors because of the extra info.
> head(df)
A B C
1 ordinal nominal nominal
2 1 A 1
3 2 B 2
4 3 A 1
> str(df)
'data.frame': 4 obs. of 3 variables:
$ A: Factor w/ 4 levels "1","2","3","ordinal": 4 1 2 3
$ B: Factor w/ 3 levels "A","B","nominal": 3 1 2 1
$ C: Factor w/ 3 levels "1","2","nominal": 3 1 2 1
If we get rid of the first row and recode into the correct types, we can compute Gower's coefficient easily.
> headers <- df[1,]
> df <- df[-1,]
> DF <- transform(df, A = ordered(A), B = factor(B), C = factor(C))
> ## We've previously shown you how to do this (above line) for lots of columns!
> str(DF)
'data.frame': 3 obs. of 3 variables:
$ A: Ord.factor w/ 3 levels "1"<"2"<"3": 1 2 3
$ B: Factor w/ 2 levels "A","B": 1 2 1
$ C: Factor w/ 2 levels "1","2": 1 2 1
> require(cluster)
> daisy(DF)
Dissimilarities :
2 3
3 0.8333333
4 0.3333333 0.8333333
Metric : mixed ; Types = O, N, N
Number of objects : 3
Which gives the same as gower.dist() for this data (although in a slightly different format (as.matrix(daisy(DF))) would be equivalent):
> gower.dist(DF)
[,1] [,2] [,3]
[1,] 0.0000000 0.8333333 0.3333333
[2,] 0.8333333 0.0000000 0.8333333
[3,] 0.3333333 0.8333333 0.0000000
You say you can't do it this way? Can you explain why not? As you seem to be going to some degree of effort to do something that other people have coded up for you already. This isn't homework, is it?
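For completeness, here is the quoted formula worked by hand on the tiny example above, as a Python sketch (illustrative only; the real work belongs in R with daisy() or gower.dist()). A is ordinal with three levels, so its values become z = 0, 0.5, 1; B and C are nominal; there are no NAs, so every delta_ijk = 1:

```python
rows = [
    {"A": 1, "B": "A", "C": "1"},
    {"A": 2, "B": "B", "C": "2"},
    {"A": 3, "B": "A", "C": "1"},
]

def gower(ri, rj):
    # Ordinal A: position index rescaled to [0, 1] via (r - 1)/(max(r) - 1),
    # then treated as an interval-scaled variable.
    z = lambda r: (r - 1) / (3 - 1)
    d_a = abs(z(ri["A"]) - z(rj["A"]))
    # Nominal B and C: 0 if equal, 1 otherwise.
    d_b = 0 if ri["B"] == rj["B"] else 1
    d_c = 0 if ri["C"] == rj["C"] else 1
    return (d_a + d_b + d_c) / 3  # all weights delta_ijk = 1 here

print(round(gower(rows[0], rows[1]), 7))  # 0.8333333, matching daisy()
print(round(gower(rows[0], rows[2]), 7))  # 0.3333333
print(round(gower(rows[1], rows[2]), 7))  # 0.8333333
```

These are the same three dissimilarities that daisy(DF) and gower.dist(DF) report above.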
I'm not sure what your logic is doing, but you are putting too many "{" in there for your own good. I generally use the {} pairs to surround the consequent-clause:
Delta=function(i,j){for (i in 1:28) {for (j in 1:47){
if (MyHeader[i,j]=="nominal") {
result=0
# the "{" in the next line before else was sabotaging your efforts
} else if (MyHeader[i,j]=="ordinal") { result=1} }
result}
}
Thanks Gavin and DWin for your help. I managed to solve the problem and find the right distance matrix. I used daisy() after I recoded the class of the data and it worked.
P.S. The solution that you suggested at my other topic for changing the class of the columns:
DF$nominal <- as.factor(DF$nominal)
DF$ordinal <- as.ordered(DF$ordinal)
didn't work. It changed only the first nominal and ordinal column.
Thanks again for your help.

What is the "biggest" negative number on a 4-bit machine?

Or, what is the range of numbers that can be represented on a 4-bit machine using 2s-complement?
That would be -8 to +7
The range is -8 to 7, or 1000 to 0111. You can see the full range here.
4 bits (using 2's complement) will give you a range from -8 to 7.
This should be straightforward to work out yourself.
Range in twos complement will be:
-1 * 2 ^ (bits - 1)
to
2 ^ (bits - 1) - 1
So for 4 bits:
-1 * 2 ^ (4 - 1) = -1 * 2 ^ 3 = -8
to
2 ^ (4 - 1) - 1 = 2 ^ 3 - 1 = 7
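The same formula in a few lines of Python, for anyone who wants to check other widths (an illustrative sketch, with a made-up function name):

```python
def twos_complement_range(bits):
    # Smallest and largest values representable in `bits` bits
    # of two's complement.
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

print(twos_complement_range(4))   # (-8, 7)
print(twos_complement_range(8))   # (-128, 127)
print(twos_complement_range(16))  # (-32768, 32767)
```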
Also, if you are interested and for others maybe browsing this question -
twos complement is used for easy binary arithmetic:
to add - you just add the two numbers without conversion and disregard the overflow:
-6 + 7 = 1
is
1010 = -6
0111 = 7
------
(1)0001 = 1 (ignoring the overflow)
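That 4-bit addition can be mimicked in Python by masking with 0xF, which plays the role of discarding the carry out of the top bit (a sketch; Python's integers are arbitrary-precision, so the mask is what provides the 4-bit wraparound):

```python
def to_signed4(x):
    # Interpret the low 4 bits of x as a two's-complement value.
    x &= 0xF
    return x - 16 if x & 0x8 else x

a = -6 & 0xF           # 0b1010
b = 7 & 0xF            # 0b0111
s = (a + b) & 0xF      # add, then drop the overflow bit
print(bin(s), to_signed4(s))  # 0b1 1, i.e. -6 + 7 = 1
```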
...and more yet - to convert a negative binary number to its opposite positive number:
if the sign bit (highest order bit) is 1, indicating negative... reading from least significant to most significant bit (right to left), preserve every bit up through the first "1", then invert every bit after that.
So, with 8 bit
10011000 .. becomes
01101000 (* -1) = 104 * -1 = -104
and that is why 10000000 is your lowest negative number (or, for X bits, 1 followed by all zeroes): it translates to unsigned 10000000 (= 128) times -1 = -128.
Maybe a long-winded answer, but for those without the 1s-and-0s background, I figure it is useful.
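The "preserve up through the first 1, then invert the rest" rule can be checked mechanically. A sketch operating on bit strings (the helper name is made up):

```python
def negate_by_bit_trick(bits):
    # Scan from the right: keep everything up through the first '1',
    # invert every bit to the left of it.
    i = bits.rfind("1")
    if i == -1:               # all zeros: negating 0 gives 0
        return bits
    flipped = "".join("1" if c == "0" else "0" for c in bits[:i])
    return flipped + bits[i:]

print(negate_by_bit_trick("10011000"))  # 01101000 = 104, so 10011000 is -104
```

Applying the same rule to 01101000 gives back 10011000, and the result agrees with the usual invert-all-bits-then-add-one recipe.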
Well let's dissect this question.
Firstly, the question should be framed as: "What is the smallest (since the number is negative, 'biggest' is not the right word) negative number that can be stored in 4 bits?"
Numbers are of two types -
Signed (holds both + and - numbers )
Unsigned (holds only + numbers)
We will be using binary representation to understand this.
For Unsigned -
4-bit minimum number = 0000 (0)
4-bit maximum number = 1111 (15)
So the range would be 0 to 15.
For signed 4-bit numbers, the first bit represents the sign (+/-) and the remaining 3 bits represent the number.
4-bit minimum will be a negative number.
So replace the first bit(MSB) with 1 in 0000(least unsigned number), making it 1000.
Calculate decimal equivalent of 1000 = 1*2^3 = 8
So, number would be -8 (- as we have 1 as the first bit in 1000)
4-bit maximum will be a positive number.
So replace the first bit(MSB) with 0 in 1111(largest unsigned number), making it 0111.
Calculate decimal equivalent of 0111 = 1*2^2 + 1*2^1 + 1*2^0 = 7
So, number would be +7 (+ as we have 0 as the first bit in 0111)
Range would be -8 to +7.