Create regression tables with estout/esttab for interactions in Stata - regression

In one of my models I use the standard built-in notation for interaction terms in Stata, in another, I have to manually code this. In the end, I would like to present nice regression tables, using esttab. How can I show identical, but slightly different coded, interaction terms in the same row? Or imagine, it's actually another interaction, how can I force esttab to ignore that?
// interaction model 1
sysuse auto, clear
regress price weight c.mpg##c.mpg foreign
estimates store model1
// interaction model 2
gen int_mpg_mpg = mpg*mpg
regress price weight mpg int_mpg_mpg foreign
estimates store model2
// make nice regression table sidy-by-side
// manual label manual interactions
label variable int_mpg_mpg "Mileage (mpg) # Mileage (mpg)"
esttab model1 model2, label
// export to latex
label variable int_mpg_mpg "Mileage (mpg) $\times$ Mileage (mpg) "
esttab model1 model2 using "table.tex", ///
label nobaselevels beta not interaction(" $\times$ ") style(tex) replace
Output to console:
Output to LaTeX:
In both cases the manual variable label shows up as a name in regression tables. But identical variables names are not aligned in the same row. I am more interested in the solution for the LaTeX output, but the problem seems to be unrelated to LaTeX.

esttab won't be able to ignore something like that as the variables are unique in how they're specified. I would recommend doing all your interaction terms in the same way that works across both specifications such as interaction model 2.

For multiple different interaction terms, you can rename the interaction terms themselves before the regressions. For example, to estimate heterogenous treatment effects by different covariates, you could run:
foreach var of varlist age education {
cap drop interaction
gen interaction = `var'
reg outcome i.treatment##c.interaction
est store `var'
}
In an esttab or estout there will be one row for the interaction effect, and one row for the main effect. This is a bit of a crude workaround, but normally does the job.

The issue should be addressed on the level "how Stata names the equations and coefficients across estimators". I adapted the code from Andrew:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1551586-align-nls-and-mle-estimates-for-the-same-variable-in-the-same-row-esttab
He is using Ben Jann's program erepost from SSC (ssc install erepost).
* model 1
sysuse auto, clear
eststo clear
gen const=1
qui regress price weight c.mpg##c.mpg foreign
mat b=e(b)
* store estimates
eststo model1
* model 2
gen int_mpg_mpg = mpg*mpg // generate interaction manually
qui regress price weight mpg int_mpg_mpg foreign
* rename interaction with additional package erepost
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
capt prog drop replace_b
program replace_b, eclass
erepost b= b, rename
end
replace_b
eststo model2
esttab model1 model2, mtitle("Interaction V1" "Interaction V2")
Now, all interactions (automatic and manual) are aligned:
--------------------------------------------
(1) (2)
Interactio~1 Interactio~2
--------------------------------------------
main
weight 3.038*** 3.038***
(3.84) (3.84)
mpg -298.1 -298.1
(-0.82) (-0.82)
c.mpg#c.mpg 5.862 5.862
(0.90) (0.90)
foreign 3420.2*** 3420.2***
(4.62) (4.62)
_cons -527.0 -527.0
(-0.08) (-0.08)
--------------------------------------------
N 74 74
--------------------------------------------

Related

Pytorch : different behaviours in GAN training with different, but conceptually equivalent, code

I'm trying to implement a simple GAN in Pytorch. The following training code works:
for epoch in range(max_epochs): # loop over the dataset multiple times
print(f'epoch: {epoch}')
running_loss = 0.0
for batch_idx,(data,_) in enumerate(data_gen_fn):
# data preparation
real_data = data
input_shape = real_data.shape
inputs_generator = torch.randn(*input_shape).detach()
# generator forward
fake_data = generator(inputs_generator).detach()
# discriminator forward
optimizer_generator.zero_grad()
optimizer_discriminator.zero_grad()
#################### ALERT CODE #######################
predictions_on_real = discriminator(real_data)
predictions_on_fake = discriminator(fake_data)
predictions = torch.cat((predictions_on_real,
predictions_on_fake), dim=0)
#########################################################
# loss discriminator
labels_real_fake = torch.tensor([1]*batch_size + [0]*batch_size)
loss_discriminator_batch = criterion_discriminator(predictions,
labels_real_fake)
# update discriminator
loss_discriminator_batch.backward()
optimizer_discriminator.step()
# generator
# zero the parameter gradients
optimizer_discriminator.zero_grad()
optimizer_generator.zero_grad()
fake_data = generator(inputs_generator) # make again fake data but without detaching
predictions_on_fake = discriminator(fake_data) # D(G(encoding))
# loss generator
labels_fake = torch.tensor([1]*batch_size)
loss_generator_batch = criterion_generator(predictions_on_fake,
labels_fake)
loss_generator_batch.backward() # dL(D(G(encoding)))/dW_{G,D}
optimizer_generator.step()
If I plot the generated images for each iteration, I see that the generated images look like the real ones, so the training procedure seems to work well.
However, if I try to change the code in the ALERT CODE part , i.e., instead of:
#################### ALERT CODE #######################
predictions_on_real = discriminator(real_data)
predictions_on_fake = discriminator(fake_data)
predictions = torch.cat((predictions_on_real,
predictions_on_fake), dim=0)
#########################################################
I use the following:
#################### ALERT CODE #######################
predictions = discriminator(torch.cat( (real_data, fake_data), dim=0))
#######################################################
That is conceptually the same (in a nutshell, instead of doing two different forward on the discriminator, the former on the real, the latter on the fake data, and finally concatenate the results, with the new code I first concatenate real and fake data, and finally I make just one forward pass on the concatenated data.
However, this code version does not work, that is the generated images seems to be always random noise.
Any explanation to this behavior?
Why do we different results?
Supplying inputs in either the same batch, or separate batches, can make a difference if the model includes dependencies between different elements of the batch. By far the most common source in current deep learning models is batch normalization. As you mentioned, the discriminator does include batchnorm, so this is likely the reason for different behaviors. Here is an example. Using single numbers and a batch size of 4:
features = [1., 2., 5., 6.]
print("mean {}, std {}".format(np.mean(features), np.std(features)))
print("normalized features", (features - np.mean(features)) / np.std(features))
>>>mean 3.5, std 2.0615528128088303
>>>normalized features [-1.21267813 -0.72760688 0.72760688 1.21267813]
Now we split the batch into two parts. First part:
features = [1., 2.]
print("mean {}, std {}".format(np.mean(features), np.std(features)))
print("normalized features", (features - np.mean(features)) / np.std(features))
>>>mean 1.5, std 0.5
>>>normalized features [-1. 1.]
Second part:
features = [5., 6.]
print("mean {}, std {}".format(np.mean(features), np.std(features)))
print("normalized features", (features - np.mean(features)) / np.std(features))
>>>mean 5.5, std 0.5
>>>normalized features [-1. 1.]
As we can see, in the split-batch version, the two batches are normalized to the exact same numbers, even though the inputs are very different. In the joint-batch version, on the other hand, the larger numbers are still larger than the smaller ones as they are normalized using the same statistics.
Why does this matter?
With deep learning, it's always hard to say, and especially with GANs and their complex training dynamics. A possible explanation is that, as we can see in the example above, the separate batches result in more similar features after normalization even if the original inputs are quite different. This may help early in training, as the generator tends to output "garbage" which has very different statistics from real data.
With a joint batch, these differing statistics make it easy for the discriminator to tell the real and generated data apart, and we end up in a situation where the discriminator "overpowers" the generator.
By using separate batches, however, the different normalizations result in the generated and real data to look more similar, which makes the task less trivial for the discriminator and allows the generator to learn.

How to plot a transfer function from a Cauer network

The picture below shows a Cauer network, which is a continued fraction network.
I have built the 3rd olrder transfer function 3rd Octave like this:
function uebertragung=G(R1,Tau1,R2,Tau2,R3,Tau3)
s= tf("s");
C1= Tau1/R1;
C2= Tau2/R2;
C3= Tau3/R3;
# --- Uebertragungsfunktion 3.Ordnung --- #
uebertragung= 1/((s*R1*C1)^3+5*(s*R2*C2)^2+6*s*R3*C3+1);
endfunction
R1,R2,R3,C1,C2,C3 are the 6 parameters my characteristic curve depends on.
I need to put this parameters into the tranfser function, get a result and plot the characteristic curve from the data.
The characteristic curve shows thermal impedance vs time. Like these 2 curves from an igbt data sheet.
My problem is I don't know how to handle transfer functions properly. I need data to plot the characteristic curve but I don't know how to generate them out of the transfer function.
Any tips are welcome. Do I have to make Laplace transformation?
If you need further Information ask me and I try to provide them all.
From the data sheet, the equation they are using for their transient thermal impedance graph is the Foster chain step function response:
Z(t) = sum (R_i * (1-exp(-t/tau_i))) = sum (R_i * (1-exp(-t/(R_i*C_i))))
I verified that the stage R's and C's in the table by the graph will produce the plot you shared with that function.
The method for producing a step function response of an s-domain (Laplace domain) impedance function (Z) is to take the inverse Laplace transform of the product of the transfer function and 1/s (the Laplace domain form of a constant value step function). With the Foster model impedance function:
Z(s) = sum (R_i/(1+R_i*C_i*s))
that will produce the equation above.
Using the transfer function in Octave, you can use the Control package function step to calculate the transient response for you rather than performing the inverse Laplace transform yourself. So once you have Z(s), step(Z) will produce or plot the transient response. See help step for details. You can then adjust the plot (switch to log scale, set axes limits, etc) to look like one of the spec sheet plots.
Now, you want to do the same thing with a Cauer network model. It is important to realize that the R's and C's will not be the same for the two models. The Foster network is a decoupled model that has each primary complex pole isolated by layout, but the R's and C's are actually convolutions of the physical thermal resistances and capacitances in the real package. On the contrary, the Cauer model has R's and C's that match the physical package layers, and the poles in the s-domain transfer function will be complex products of the multiple layers.
So, however you are obtaining your R's and C's for the Cauer model, you can't just use the same values they have in their Foster model parameter table. They can be calculated from physical layer and material properties, however, assuming you have that information. Once you do have useful values, the procedure for going from Z(s) to the transient impedance function is the same for either network, and they should produce the same result.
As an example, the following procedure should work in both Octave and Matlab to plot the Thermal impedance curve from the spec sheet data using the Foster Z(s) model as a starting point. For the Cauer model, just use a different Z(s) function.
(Note that Octave has some issues in the step function that insert t = 0 entries into the time series output, even when they aren't specified, which can cause some errors when trying to plot on a log scale. so this example puts in a t=0 node then ignores it. wanted to explain so that line didn't seem confusing).
s = tf('s')
R1 = 8.5e-3; R2 = 2e-3;
tau1 = 151e-3; tau2 = 5.84e-3;
C1 = tau1/R1; C2 = tau2/R2;
input_imped = R1/(1+R1*C1*s)+R2/(1+R2*C2*s)
times = linspace(0, 10, 100000);
[Zvals,output_times] = step(input_imped, times);
loglog(output_times(2:end), Zvals(2:end));
xlim([.001 10]); ylim([0.0001, .1]);
grid;
xlabel('t [s]');
ylabel('Z_t_h_(_j_-_c_) [K/W] IGBT');
text(1,0.013 ,'Z_t_h_(_j_-_c_) IGBT');

p values for random effects in lmer

I am working on a mixed model using lmer function. I want to obtain p-values for all the fixed and random effects. I am able to obtain p-values for fixed effects using different methods but I haven't found anything for random effects. Whatever method I have found on the internet is to make a null model for the same and then get the p-values by comparison. Can I have a method through which I don't need to make an another model?
My model looks like:
mod1 = lmer(Out ~ Var1 + (1 + Var2 | Var3), data = dataset)
You must do these through model comparison, as far as I know. The lmerTest package has a function called step, which will reduce your model to just the significant parameters (fixed and random) based on a number of different tests. The documentation isn't entirely clear on how everything is done, so I much prefer to use model comparison to get at specific tests.
For your model, you could test the random slope by specifying:
mod0 <- lmer(Out ~ Var1 + (1 + Var2 | Var3), data = dataset, REML=TRUE)
mod1 <- lmer(Out ~ Var1 + (1 | Var3), data = dataset, REML=TRUE)
anova(mod0, mod1, refit=FALSE)
This will show you the log likelihood test and test statistic (chi-square distributed). But you are testing two parameters here: the random slope of Var2 and the covariance between the random slopes and random intercepts. So you need a p-value adjustment:
1-(.5*pchisq(anova(mod0,mod1, refit=FALSE)$Chisq[[2]],df=2)+
.5*pchisq(anova(mod0,mod1, refit=FALSE)$Chisq[[2]],df=1))
More on those tests here or here.

How to batch rename variables in esttab

I am using the community-contributed command esttab with the rename() option.
I have a special situation in which I run multiple regressions where each regression has a coefficient that is from a different (similarly-named) variable, but each corresponds to the same idea.
Here is a (very contrived) toy example:
sysuse auto, clear
rename weight mpg1
rename mpg mpg2
rename turn mpg3
I want to display the results of three regressions, but have only one line for mpg1, mpg2, and mpg3 (instead of each one appearing on a separate line).
One way to accomplish this is to do the following:
eststo clear
eststo: quietly reg price mpg1
eststo: quietly reg price mpg2
eststo: quietly reg price mpg3
esttab, rename(mpg1 mpg mpg2 mpg mpg3 mpg)
Can I rename all of the variables at the same time by doing something such as rename(mpg* mpg)?
If I want to run a large number of regressions, it becomes more advantageous to do this instead of writing them all out by hand.
Stata's rename group command can handle abbreviations and wildcards, unlike the rename() option of estout. However, for the latter, you need to build a list of names and store it in a local macro.
Below you can find an improved version of your toy example code:
sysuse auto, clear
eststo clear
rename (weight mpg turn) mpg#, addnumber
forvalues i = 1 / 3 {
eststo: quietly reg price mpg`i'
local mpglist `mpglist' mpg`i' mpg
}
esttab, rename(`mpglist')
------------------------------------------------------------
(1) (2) (3)
price price price
------------------------------------------------------------
mpg 2.044*** -238.9*** 207.6**
(5.42) (-4.50) (2.76)
_cons -6.707 11253.1*** -2065.0
(-0.01) (9.61) (-0.69)
------------------------------------------------------------
N 74 74 74
------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

Remove outliers with large standardized residuals in Stata

I run a simple regression in Stata for two subsamples and afterwards I want to exclude all observations with standardized residuals larger than 3.0. I tried:
regress y x if subsample_criteria==1
gen st_res1=e(rsta)
regress y x if subsample_criteria==0
gen st_res2=e(rsta)
drop if st_res1 | st_res2 > 3.0
However, the new variable is full of missing values and the values for the stand. residuals are not stored in the variables st_res1 and st_res2.
I am grateful for any hints!
The problem with your code is that Stata does not know what e(rsta) is (and neither do I), so it creates a missing, which Stata thinks of as very large positive number. All missings are greater than 3, so your constraint does not bind.
Ignoring the statistical merits of doing this, here's one way:
sysuse auto, clear
reg price mpg
predict ehat, rstandard
reg price mpg if abs(ehat)<3
Note that I am using the absolute value of the residual, which I think makes more sense here.
First, providing a MCVE is always a good first step (and fairly easy given Stata's sysuse and webuse commands). Now, on to the question.
See help regress postestimation and help predict for the proper syntax for generating new variables with residuals, etc. The syntax is a bit different from the gen command, as you will see below.
Note also that your drop if condition is improperly formatted, and right now is interpreted as drop if st_res1 != 0 | st_res2 > 3.0. (I also assume you want to drop standardized residuals < -3.0, but if this is incorrect, you can remove the abs() function.)
sysuse auto , clear
replace mpg = 10000 in 1/2
replace mpg = 0.0001 in 70
reg mpg weight if foreign
predict rst_for , rstandard
reg mpg weight if !foreign
predict rst_dom , rstandard
drop if abs(rst_for) > 3.0 | abs(rst_dom) > 3.0
Postscript: Note that you may also consider adding if e(sample) to your predict commands, depending on whether you wish to extrapolate the results of the subsample regression to the entire sample and evaluate all residuals, or whether you only wish to drop observations based on in-sample standardized residuals.