Simulation from a nested glmer model - lme4

I'm having a problem generating simulations from a 3 level glmer model when conditioning on the random effects (I'm actually using predict via bootMer but the problem is the same).
This works:
library(lme4)
fit1 = glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)
simulate(fit1, re.form=NULL)
This fails:
cbpp$bigherd = rep(1:7, 8)
fit2 = glmer(cbind(incidence, size - incidence) ~ period + (1 | bigherd / herd),
             data = cbpp, family = binomial)
simulate(fit2, re.form=NULL)
Error: No random effects terms specified in formula
Many thanks for any ideas.
Update
Ben, many thanks for your help below, really appreciate it. I wonder if I can impose on you again.
What I want to do is simulate predictions on the response scale, and I'm not sure whether I can use your workaround, or whether there is an alternative to what I'm doing. Thank you!
This works as expected, but is not conditional on random effects:
FUN = function(.) {
  predict(., type = "response")
}
bootMer(fit2, FUN, nsim=3)$t
This doesn't work, as would be expected given above problem:
bootMer(fit2, FUN, nsim=3, use.u=TRUE)$t
As far as I can see, I can't pass re.form to bootMer.
Does the alternative below result in simulated predictions conditional on random effects without passing use.u to bootMer?
FUN = function(.) {
  predict(., type = "response", re.form = ~(1 | herd:bigherd) + (1 | bigherd))
}
bootMer(fit2, FUN, nsim=10)$t

I'm not sure what's going on yet, but here are two workarounds that do work:
simulate(fit2, re.form=lme4:::reOnly(formula(fit2)))
simulate(fit2, re.form=~(1|herd:bigherd) + (1|bigherd))
There must be something going wrong with the expansion of the "slash" term, because this doesn't work:
simulate(fit2, re.form=~(1|bigherd/herd))
I've posted this as an lme4 issue.
These workarounds don't work for bootMer (which only takes the use.u argument, not re.form) in the current CRAN release (1.1-9).
It is fixed in the development version on Github (1.1-10): devtools::install_github("lme4/lme4") will install it, if you have compilation tools installed.
In the meantime you could just go ahead and implement your own parametric bootstrap: for parametric bootstrapping, bootMer is actually a very thin wrapper around simulate(), refit() (or update()), and FUN. Much of its internal complication has to do with parallel computation, which you would have to add back in if you wanted parallel computation in your own PB implementation.
This is the outline of a hand-rolled parametric bootstrap:
nboot <- 10
nresp <- length(FUN(orig_fit))
res <- matrix(NA, nboot, nresp)
for (i in 1:nboot) {
  res[i, ] <- FUN(update(orig_fit, data = simulate(orig_fit, ...)))
  ## or use refit() for LMMs
  ## ... are options to simulate()
}
t(apply(res, 2, quantile, c(0.025, 0.975)))


Riding the wave: numerical schemes for hyperbolic PDEs (Lorena Barba lessons), assistance needed

I am a beginner Python user trying to get a feel for computer science. I've been learning by studying concepts and subjects I'm already familiar with, such as Computational Fluid Dynamics (CFD) and Finite Element Analysis (FEA); I got my degree in mechanical engineering, so I don't have much CS background.
I'm studying a series by Lorena Barba on Jupyter's nbviewer, Practical Numerical Methods, and I'm looking for some help, hopefully from someone familiar with CFD and FEA in general.
If you click on the link below and go to the corresponding output cell, you'll find the code I've pasted here. I'm really confused by this block of code, which sits inside the function being defined.
Anyway, if anyone out there has any suggestions on how to tackle learning Python, help would be much appreciated.
In [9]:
rho_hist = [rho0.copy()]
rho = rho0.copy()  # I'm confused by the role of this variable here
for n in range(nt):
    # Compute the flux.
    F = flux(rho, *args)
    # Advance in time using the Lax-Friedrichs scheme.
    rho[1:-1] = (0.5 * (rho[:-2] + rho[2:]) -
                 dt / (2.0 * dx) * (F[2:] - F[:-2]))
    # Set the value at the first location.
    rho[0] = bc_values[0]
    # Set the value at the last location.
    rho[-1] = bc_values[1]
    # Record the time-step solution.
    rho_hist.append(rho.copy())
return rho_hist
http://nbviewer.jupyter.org/github/numerical-mooc/numerical-mooc/blob/master/lessons/03_wave/03_02_convectionSchemes.ipynb
The intent of the first two lines is to preserve rho0: one copy goes into the history and another becomes the initial value of the "working" variable rho that is used and modified during the computation (copies, so that later changes to the working variable do not reflect back into rho0).
The background is that Python list and array variables are always references to the object in question. Assigning to a new variable copies the reference, i.e. the address of the object, but not the object itself; both variables then refer to the same memory. Thus, not using .copy() would change rho0.
a = [1, 2, 3]
b = a
b[2] = 5
print(a)
# [1, 2, 5]
Composite objects that themselves contain structured data objects need copy.deepcopy to copy the data on all levels.
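For example, here is a small sketch of the difference (the values are invented for illustration):

import copy

a = [[1, 2], [3, 4]]
shallow = list(a)        # copies only the outer list; the inner lists are shared
deep = copy.deepcopy(a)  # copies the data on all levels

a[0][0] = 99
print(shallow[0][0])  # 99 -- the shallow copy still sees the change
print(deep[0][0])     # 1  -- the deep copy is fully independent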
Numpy array values changed without being asked?
How to pass a list as value and not as reference?

SSVS and spike-slab prior with JAGS

I'm very new to this topic/posting on a discussion board, so I apologize in advance if something is unclear.
I'm interested in performing stochastic search variable selection (SSVS) in JAGS. I've seen code online of people performing SSVS (e.g. http://www4.stat.ncsu.edu/~reich/ABA/code/SSVS, which I've copied below), but my understanding is that to perform this method I need to use a spike-and-slab prior in JAGS. The spike can be either a point mass or a distribution with a very narrow variance. Looking at most people's code, there is only one distribution being defined (in the one above, they define a distribution on gamma, with beta = gamma * delta), and I believe they are assuming a point mass on the spike.
So my questions are:
1) Can someone explain why the code below is using the SSVS method? For example, how do we know it isn't using GVS, another method that uses a Gibbs sampler?
2) Is this a point mass on the spike?
3) If I wanted to use simulated data to test whether the Gibbs sampler is correctly drawing from the spike/slab, how would I go about doing that? Would I code for a spike and a slab and what would I be looking for in the posterior to see that it is drawing correctly?
model_string <- "model{
# Likelihood
for(i in 1:n){
Y[i] ~ dpois(lambda[I])
log(lambda[i]) <- log(N[i]) + alpha + inprod(beta[],X[i,])
}
#Priors
for(j in 1:p){
gamma[j] ~ dnorm(0,tau)
delta[j] ~ dbern(prob)
beta[j] <- gamma[j]*delta[j]
}
prob ~ dunif(0,1)
tau ~ dgamma(.1,.1)
alpha ~ dnorm(0,0.1)
}"
I've also asked on the JAGS help page too: https://sourceforge.net/p/mcmc-jags/discussion/610036/thread/a44343e0/#ab47
I am also (trying to) work on some Bayesian variable selection stuff in JAGS. I am by no means an expert on this topic, but maybe if we chat about this more we can learn together. Here is my interpretation of the variable selection within this code:
model_string <- "model{
Likelihood
for(i in 1:n){
Y[i] ~ dpois(lambda[I])
log(lambda[i]) <- log(N[i]) + alpha + inprod(beta[],X[i,])
}
Priors
for(j in 1:p){
gamma[j] ~ dnorm(0,tau)
delta[j] ~ dbern(prob) # delta has a Bernoulli distributed prior (so it can only be 1:included or 0:notincluded)
beta[j] <- gamma[j]*delta[j] # delta is the inclusion probability
}
prob ~ dunif(0,1) # This is then setting an uninformative prior around the probability of a varible being included into the model
tau ~ dgamma(.1,.1)
alpha ~ dnorm(0,0.1)
}"
I have added comments to the variable-selection sections of the model above. The code looks really similar to the Kuo & Mallick method of Bayesian variable selection. I am currently having trouble tuning the spike-and-slab method so that the estimates mix properly instead of "getting stuck" on either 0 or 1.
So my priors are set up more like:
beta ~ dnorm(0, tau)
tau <- (100 * (1 - gamma)) + (0.001 * gamma)  # precision: a narrow spike (tau = 100) when gamma = 0, a wide slab (tau = 0.001) when gamma = 1
gamma ~ dbern(0.5)  # gamma is the inclusion indicator
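Regarding your question 3, one simple sanity check is to draw directly from this prior and confirm that the spike and slab components look the way you expect before comparing with the Gibbs sampler's output. Here is a minimal sketch in Python (the sample size is made up, and JAGS's precision parameterization is converted to standard deviations):

import numpy as np

rng = np.random.default_rng(0)
p = 10000

gamma = rng.binomial(1, 0.5, size=p)        # inclusion indicator ~ Bern(0.5)
tau = 100.0 * (1 - gamma) + 0.001 * gamma   # precision: spike if gamma = 0, slab if gamma = 1
beta = rng.normal(0.0, 1.0 / np.sqrt(tau))  # dnorm(0, tau), with tau converted to a standard deviation

print(beta[gamma == 0].std())  # ~0.1  : the narrow spike around zero
print(beta[gamma == 1].std())  # ~31.6 : the wide slab

Posterior draws of beta from a correctly working sampler should show the same two-component structure, depending on whether the corresponding gamma is 0 or 1.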
I have found this paper helpful for explaining the differences between variable selection methods (it gets into GVS vs. SSVS):
O'Hara, R.B. & Sillanpää, M.J. (2009). A review of Bayesian variable selection methods: what, how and which. Bayesian Analysis, 4, 85–118.
Or this blog post: https://darrenjw.wordpress.com/2012/11/20/getting-started-with-bayesian-variable-selection-using-jags-and-rjags/
If there were no SSVS on the beta prior, the prior would look more like this:
  # Priors
  for(j in 1:p){
    beta[j] ~ dnorm(0, 0.01)  # just a normally distributed prior around beta (or whatever distribution you're working with)
  }
  tau ~ dgamma(0.1, 0.1)
  alpha ~ dnorm(0, 0.1)
}"

Indirect Kalman Filter for Inertial Navigation System

I'm trying to implement an Inertial Navigation System using an Indirect Kalman Filter. I've found many publications and theses on this topic, but not much example code. For my implementation I'm using the Master's thesis available at the following link:
https://fenix.tecnico.ulisboa.pt/downloadFile/395137332405/dissertacao.pdf
As reported on page 47, the measured values from the inertial sensors equal the true values plus a series of other terms (bias, scale factors, ...).
For my question, let's consider only bias.
So:
Wmeas = Wtrue + BiasW   (gyro measurement)
Ameas = Atrue + BiasA   (accelerometer measurement)
Therefore, when I propagate the mechanization equations (equations 3-29, 3-37 and 3-41), I should use the "true" values, or better:
Wmeas - BiasW
Ameas - BiasA
where BiasW and BiasA are the latest available estimates of the biases. Right?
Concerning the update phase of the EKF, if the measurement equation is
dzV = VelGPS_est - VelGPS_meas
then the H matrix should have an identity block corresponding to the velocity error state variables dx(VEL) and zeros elsewhere. Right?
That said, I'm not sure how to propagate the state variables after the update phase. The propagation should be (in my opinion):
POSk|k = POSk|k-1 + dx(POS);
VELk|k = VELk|k-1 + dx(VEL);
...
But this didn't work. Therefore I've tried:
POSk|k = POSk|k-1 - dx(POS);
VELk|k = VELk|k-1 - dx(VEL);
but that didn't work either... I tried both solutions, even though in my opinion the "+" should be used. Since neither works (I have some other error elsewhere), I would like to ask whether you have any suggestions.
You can see a snippet of code at the following link: http://pastebin.com/aGhKh2ck.
Thanks.
The difficulty you're running into is the difference between the theory and the practice. Taking your code from the snippet instead of the symbolic version in the question:
% Apply corrections
Pned = Pned + dx(1:3);
Vned = Vned + dx(4:6);
In theory when you use the Indirect form you are freely integrating the IMU (that process called the Mechanization in that paper) and occasionally running the IKF to update its correction. In theory the unchecked double integration of the accelerometer produces large (or for cheap MEMS IMUs, enormous) error values in Pned and Vned. That, in turn, causes the IKF to produce correspondingly large values of dx(1:6) as time evolves and the unchecked IMU integration runs farther and farther away from the truth. In theory you then sample your position at any time as Pned +/- dx(1:3) (the sign isn't important -- you can set that up either way). The important part here is that you are not modifying Pned from the IKF because both are running independent from each other and you add them together when you need the answer.
In practice you do not want to take the difference between two enormous double values, because you will lose precision (many of the bits of the significand were needed to represent the enormous part instead of the precision you want). You have grasped that in practice you want to recursively update Pned on each update. However, when you diverge from the theory this way, you have to take the corresponding (and somewhat unobvious) step of zeroing out your correction value from the IKF state vector. In other words, after you do Pned = Pned + dx(1:3) you have "used" the correction, and you need to balance the equation with dx(1:3) = dx(1:3) - dx(1:3) (simplified: dx(1:3) = 0) so that you don't inadvertently integrate the correction over time.
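As a language-agnostic illustration, here is a minimal Python sketch of that "apply the correction, then zero it" step. The names (pned, vned, dx) mirror the MATLAB snippet above; the error-state layout and everything else are assumptions for illustration only:

import numpy as np

def apply_correction(pned, vned, dx):
    # Fold the IKF error estimate into the mechanized states, then zero
    # the used portion of the error state (hypothetical layout:
    # dx[0:3] = position error, dx[3:6] = velocity error).
    pned = pned + dx[0:3]  # apply position correction
    vned = vned + dx[3:6]  # apply velocity correction
    dx = dx.copy()
    dx[0:6] = 0.0          # the correction has been "used": reset it
    return pned, vned, dx

pned, vned, dx = apply_correction(np.zeros(3), np.zeros(3), np.arange(9.0))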
Why does this work? Why doesn't it mess up the rest of the filter? As it turns out, the KF process covariance P does not actually depend on the state x. It depends on the update function and the process noise Q and so on. So the filter doesn't care what the data is. (Now that's a simplification, because often Q and R include rotation terms, and R might vary based on other state variables, etc, but in those cases you are actually using state from outside the filter (the cumulative position and orientation) not the raw correction values, which have no meaning by themselves).

Face coloring in Matlab revisited

Using Mathematica I was able to create the following plot
Now I would like to switch to Matlab, which I am just starting to learn. I was able to create the triangulation with the FV.vertices and FV.faces matrices and the patch function; it looks like this:
faces = FV.faces;
facecolor = [.7 .7 .7];
patch('faces', faces, 'vertices', FV.vertices, ...
      'facecolor', facecolor, 'facealpha', 0.8, 'edgecolor', [.8 .8 .8]);
camlight('headlight', 'infinite');
daspect([1 1 1]); axis vis3d; axis off
material dull;
It produces a dull image:
Now, I have a function J that takes the matrix FV.vertices and returns a matrix of positive values. I would like to color the faces according to the values of J on the vertices, possibly interpolating along the faces. Edges can stay as they are for now, to be dealt with later. After reading the documentation it is not clear to me how to accomplish this. Do I need to find the min and max of J manually, or can Matlab do it automatically? It is OK for now to use one of Matlab's preset color schemes; something like a "temperature map" would do. At which point should I call my function J, and how exactly should it be used with the patch command? I looked through the previous answers to a similar question, but I am still not able to figure out how to deal with my case. Any helpful suggestion will be appreciated.
P.S.
OK. I think I did it with simple
FV.Cdata = sphere_jacobian(FV.vertices, 1, 1, 0, 1);
figure
Hp = patch('faces', FV.faces, 'vertices', FV.vertices, ...
           'FaceVertexCData', FV.Cdata, 'facecolor', 'interp', 'edgecolor', [.8 .8 .8]);
But I am not sure if min and max have been automatically computed and interpolated.
Here is what I believe to be the answer given by the poster; I will put it here so the question does not remain open.
OK. I think I did it with simple
FV.Cdata = sphere_jacobian(FV.vertices, 1, 1, 0, 1);
figure
Hp = patch('faces', FV.faces, 'vertices', FV.vertices, ...
           'FaceVertexCData', FV.Cdata, 'facecolor', 'interp', 'edgecolor', [.8 .8 .8]);
But I am not sure if min and max have been automatically computed and interpolated.
I did
colormap(hsv(3200));
and normalized my function:
jac = sphere_jacobian(FV.vertices, m);
minj = min(jac);
maxj = max(jac);
jac1 = (jac - minj*ones(size(jac))) / (maxj - minj);
FV.Cdata = jac1;
figure
Hp = patch('faces', FV.faces, 'vertices', FV.vertices, ...
           'FaceVertexCData', FV.Cdata, 'facecolor', 'interp', 'edgecolor', [.8 .8 .8]);
The result can be seen here.

Generalization functions for Q-Learning

I have to do some work with Q-learning, about an agent that has to move furniture around a house (it's basically that). If the house is small enough, I can just have a matrix that represents actions/rewards, but as the house grows bigger that will not be enough, so I will have to use some kind of generalization function instead. My teacher suggests I use not just one but several, so I can compare them. What do you recommend?
I've heard that for this situation people use Support Vector Machines and also Neural Networks. I'm not really inside the field, so I can't tell. I had some experience with Neural Networks in the past, but SVMs seem a much harder subject to grasp. Are there any other methods I should look at? I know there must be a zillion of them, but I need something just to start.
Thanks
Just as a refresher of terminology: in Q-learning, you are trying to learn the Q-function, which depends on the state and action:
Q(S,A) = ????
The standard version of Q-learning as taught in most classes says that for each S and A you learn a separate value in a table, and it tells you how to perform Bellman updates, Q(S,A) <- Q(S,A) + alpha * (R + gamma * max_A' Q(S',A') - Q(S,A)), in order to converge to the optimal values.
Now, let's say that instead of a table you use a different function approximator. For example, let's try linear functions. Take your (S,A) pair and think of a bunch of features you can extract from it. One example of a feature is "Am I next to a wall?", another is "Will the action place the object next to a wall?", etc. Number these features f1(S,A), f2(S,A), ...
Now, try to learn the Q function as a linear function of those features
Q(S,A) = w1*f1(S,A) + w2*f2(S,A) + ... + wN*fN(S,A)
How should you learn the weights w? Well, since this is homework, I'll let you think about it on your own.
However, as a hint, let's say that you have K possible states and M possible actions in each state. Let's say you define K*M features, each an indicator of whether you are in a particular state and are going to take a particular action. So
Q(S,A) = w11 * (S==1 && A==1) + w12 * (S==1 && A==2) + ... + w21 * (S==2 && A==1) + ...
Now, notice that for any state/action pair only one feature will be 1 and the rest will be 0, so Q(S,A) will be equal to the corresponding w and you are essentially learning a table. You can therefore think of standard tabular Q-learning as a special case of learning with these linear functions. So, think of what the normal Q-learning algorithm does, and what you should do here.
Hopefully you can find a small basis of features, much fewer than K*M, that will allow you to represent your space well.
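To make the indicator-feature argument concrete, here is a minimal Python sketch (K, M and the weights are invented for illustration; nothing here comes from your actual problem) showing that with one indicator feature per state/action pair, the linear Q-function is exactly a lookup table:

import numpy as np

K, M = 4, 3  # assumed number of states and actions

def indicator_features(s, a):
    # f(S, A): a K*M vector that is 1 at the (s, a) slot and 0 elsewhere.
    f = np.zeros(K * M)
    f[s * M + a] = 1.0
    return f

def q_value(w, s, a):
    # Q(S, A) = w1*f1(S, A) + w2*f2(S, A) + ... + wN*fN(S, A)
    return w @ indicator_features(s, a)

w = np.arange(K * M, dtype=float)        # pretend these weights were learned
print(q_value(w, 2, 1) == w[2 * M + 1])  # True: Q(S, A) is just the table entry

A good set of hand-designed features would replace indicator_features with something much smaller than K*M that still distinguishes the situations that matter.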