How to batch rename variables in esttab - regression

I am using the community-contributed command esttab with the rename() option.
I have a special situation in which I run multiple regressions where each regression has a coefficient that is from a different (similarly-named) variable, but each corresponds to the same idea.
Here is a (very contrived) toy example:
sysuse auto, clear
rename weight mpg1
rename mpg mpg2
rename turn mpg3
I want to display the results of three regressions, but have only one line for mpg1, mpg2, and mpg3 (instead of each one appearing on a separate line).
One way to accomplish this is to do the following:
eststo clear
eststo: quietly reg price mpg1
eststo: quietly reg price mpg2
eststo: quietly reg price mpg3
esttab, rename(mpg1 mpg mpg2 mpg mpg3 mpg)
Can I rename all of the variables at the same time by doing something such as rename(mpg* mpg)?
If I want to run a large number of regressions, it becomes more advantageous to do this instead of writing them all out by hand.

Stata's rename group command can handle abbreviations and wildcards, but the rename() option of estout cannot. For the latter, you need to build the list of old and new names yourself and store it in a local macro.
Below you can find an improved version of your toy example code:
sysuse auto, clear
eststo clear
rename (weight mpg turn) mpg#, addnumber
forvalues i = 1 / 3 {
    eststo: quietly reg price mpg`i'
    local mpglist `mpglist' mpg`i' mpg
}
esttab, rename(`mpglist')
------------------------------------------------------------
                      (1)             (2)             (3)
                    price           price           price
------------------------------------------------------------
mpg                 2.044***       -238.9***        207.6**
                    (5.42)         (-4.50)          (2.76)
_cons              -6.707         11253.1***      -2065.0
                   (-0.01)          (9.61)         (-0.69)
------------------------------------------------------------
N                      74              74              74
------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
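When the variables already exist and their number is not known in advance, the same list can be built from a wildcard instead of a numeric loop. The following is a minimal sketch, assuming the variables share the prefix mpg and that the local names mpgvars and mpglist are free to use (both are illustrative names, not part of estout):

```stata
sysuse auto, clear
eststo clear
rename (weight mpg turn) mpg#, addnumber

* expand the wildcard into the full list of matching variable names
unab mpgvars : mpg*

* start from an empty list so that a rerun does not append to an old one
local mpglist
foreach v of local mpgvars {
    eststo: quietly regress price `v'
    * each pair is "oldname newname", as rename() expects
    local mpglist `mpglist' `v' mpg
}

esttab, rename(`mpglist')
```

The wildcard is resolved by unab before estout ever sees it, so rename() still receives an explicit list of pairs.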

Create regression tables with estout/esttab for interactions in Stata

In one of my models I use Stata's standard built-in notation for interaction terms; in another, I have to code the interaction manually. In the end, I would like to present nice regression tables using esttab. How can I show identical, but differently coded, interaction terms in the same row? And if the second term were actually a different interaction, how could I force esttab to treat the two as the same row anyway?
// interaction model 1
sysuse auto, clear
regress price weight c.mpg##c.mpg foreign
estimates store model1
// interaction model 2
gen int_mpg_mpg = mpg*mpg
regress price weight mpg int_mpg_mpg foreign
estimates store model2
// make nice regression table side-by-side
// manually label the manual interaction
label variable int_mpg_mpg "Mileage (mpg) # Mileage (mpg)"
esttab model1 model2, label
// export to latex
label variable int_mpg_mpg "Mileage (mpg) $\times$ Mileage (mpg) "
esttab model1 model2 using "table.tex", ///
label nobaselevels beta not interaction(" $\times$ ") style(tex) replace
[Screenshots of the console output and the LaTeX output are not preserved.]
In both cases the manual variable label shows up as the name in the regression tables, but the identically labeled terms are not aligned in the same row. I am more interested in a solution for the LaTeX output, but the problem seems to be unrelated to LaTeX.
esttab won't be able to ignore the difference, because the two terms have different names in the stored estimates. I would recommend specifying all your interaction terms in a single way that works across both specifications, such as the manual coding in interaction model 2.
For multiple different interaction terms, you can rename the interaction terms themselves before the regressions. For example, to estimate heterogeneous treatment effects by different covariates, you could run:
foreach var of varlist age education {
    cap drop interaction
    gen interaction = `var'
    reg outcome i.treatment##c.interaction
    est store `var'
}
In the resulting esttab or estout table there will be one row for the interaction effect and one row for the main effect, shared across all models. This is a bit of a crude workaround, but it normally does the job.
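The loop above uses placeholder variables (outcome, treatment, age, education). A runnable version of the same idea on the auto data might look like the sketch below, where foreign stands in for the treatment and weight and length for the covariates; these substitutions and the m_ prefix for the stored names are assumptions made only for illustration:

```stata
sysuse auto, clear
eststo clear

foreach var of varlist weight length {
    * reuse one generic name so every model reports the same coefficient names
    capture drop interaction
    generate interaction = `var'
    quietly regress price i.foreign##c.interaction
    eststo m_`var'
}

* the column titles identify which covariate each model interacted
esttab m_weight m_length, mtitles
```

Because every model contains a variable literally called interaction, the rows line up without any renaming at the esttab stage.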
The issue is best addressed at the level of how Stata names the equations and coefficients across estimators. I adapted the code from Andrew:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1551586-align-nls-and-mle-estimates-for-the-same-variable-in-the-same-row-esttab
He is using Ben Jann's program erepost from SSC (ssc install erepost).
* model 1
sysuse auto, clear
eststo clear
gen const=1
qui regress price weight c.mpg##c.mpg foreign
* store estimates
eststo model1
* model 2
gen int_mpg_mpg = mpg*mpg // generate interaction manually
qui regress price weight mpg int_mpg_mpg foreign
* copy the coefficient vector of model 2 so that its columns can be renamed
mat b=e(b)
* rename interaction with additional package erepost
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
capt prog drop replace_b
program replace_b, eclass
    * repost the renamed vector into the current e() results
    erepost b= b, rename
end
replace_b
eststo model2
esttab model1 model2, mtitle("Interaction V1" "Interaction V2")
Now, all interactions (automatic and manual) are aligned:
--------------------------------------------
                      (1)             (2)
             Interactio~1    Interactio~2
--------------------------------------------
main
weight              3.038***        3.038***
                    (3.84)          (3.84)
mpg                -298.1          -298.1
                   (-0.82)         (-0.82)
c.mpg#c.mpg         5.862           5.862
                    (0.90)          (0.90)
foreign            3420.2***       3420.2***
                    (4.62)          (4.62)
_cons              -527.0          -527.0
                   (-0.08)         (-0.08)
--------------------------------------------
N                      74              74
--------------------------------------------
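A lighter alternative that avoids reposting e(b) is to map one coefficient name onto the other at display time with the rename() option, the same option used in the first question. This is a sketch only: it assumes that rename() accepts a factor-variable name such as c.mpg#c.mpg as the old name, which is worth checking against your installed version of estout.

```stata
sysuse auto, clear
eststo clear

* model 1: built-in interaction notation
quietly regress price weight c.mpg##c.mpg foreign
eststo model1

* model 2: manually coded interaction
generate int_mpg_mpg = mpg*mpg
label variable int_mpg_mpg "Mileage (mpg) # Mileage (mpg)"
quietly regress price weight mpg int_mpg_mpg foreign
eststo model2

* show model 1's interaction under the manual variable's name
esttab model1 model2, label rename(c.mpg#c.mpg int_mpg_mpg)
```

Renaming toward the manual variable means the label option can pick up its variable label for the shared row.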

How to use esttab to create columns with different cells

Suppose that I have this data:
sysuse auto2, clear
For two different samples, I can use the community-contributed command esttab to create a table of means a and b with standard deviations in parentheses below the means:
eststo clear
eststo a: estpost summarize trunk weight length turn
keep if inrange(mpg, 12, 20)
eststo b: estpost summarize trunk weight length turn
esttab a b, label cells("mean(fmt(2))" "sd(fmt(2) par)") ///
nonumbers booktabs collabels("a" "b")
I want the produced table to have the two columns above exactly as is, but then to add additional summary statistics (here min and max) corresponding to the b estimates.
For example, I want the third column to be like:
esttab b, label cells("min") ///
nonumbers booktabs collabels("min")
In addition, I would like the fourth column to be as follows:
esttab b, label cells("max") ///
nonumbers booktabs collabels("max")
The problem is that I am not sure how to make all of this be in one table together (other than perhaps saving everything to a matrix and using esttab on that).
The reason is that it does not seem like one can get the cells option to correspond to an individual column; it applies the changes to all columns.
Note that if there is a way to do this, but it would require the s.d.s to not be included, that is fine.
How can I generate the desired output without creating a matrix?
This is the best you can do without creating a matrix:
esttab a b, label cells( (mean(fmt(2)) min max) sd(fmt(2) par) )
--------------------------------------------------------------------------------------------------
                              (1)                                    (2)
                          mean/sd          min          max      mean/sd          min          max
--------------------------------------------------------------------------------------------------
Trunk space (.. ft.)        13.76         5.00        23.00        16.32         7.00        23.00
                           (4.28)                                 (3.28)
Weight (lbs.)             3019.46      1760.00      4840.00      3558.68      2410.00      4840.00
                         (777.19)                               (498.89)
Length (in.)               187.93       142.00       233.00       203.89       173.00       233.00
                          (22.27)                                (13.87)
Turn Circle (ft.)           39.65        31.00        51.00        42.55        36.00        51.00
                           (4.40)                                 (3.21)
--------------------------------------------------------------------------------------------------
Observations                   74                                     38
--------------------------------------------------------------------------------------------------
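One option that may get closer to the requested layout is the pattern() suboption that estout documents for individual cells() elements, which restricts an element to selected models. The sketch below assumes pattern(0 1) suppresses min and max for the first model (a) and reports them for the second (b); how the suppressed columns are rendered is not verified here, and preserve/restore is added only so that the full data set survives the subsetting.

```stata
sysuse auto2, clear
eststo clear
eststo a: estpost summarize trunk weight length turn

* restrict the sample temporarily for the second set of statistics
preserve
keep if inrange(mpg, 12, 20)
eststo b: estpost summarize trunk weight length turn
restore

* min and max are requested for model b only via pattern(0 1)
esttab a b, label nonumbers ///
    cells("mean(fmt(2)) min(pattern(0 1)) max(pattern(0 1))" "sd(fmt(2) par)")
```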

Remove outliers with large standardized residuals in Stata

I run a simple regression in Stata for two subsamples and afterwards I want to exclude all observations with standardized residuals larger than 3.0. I tried:
regress y x if subsample_criteria==1
gen st_res1=e(rsta)
regress y x if subsample_criteria==0
gen st_res2=e(rsta)
drop if st_res1 | st_res2 > 3.0
However, the new variables are full of missing values; the standardized residuals are not stored in st_res1 and st_res2.
I am grateful for any hints!
The problem with your code is that Stata does not know what e(rsta) is (and neither do I), so it creates a missing value, which Stata treats as a very large positive number. All missing values are greater than 3, so your constraint does not bind.
Ignoring the statistical merits of doing this, here's one way:
sysuse auto, clear
reg price mpg
predict ehat, rstandard
reg price mpg if abs(ehat)<3
Note that I am using the absolute value of the residual, which I think makes more sense here.
First, providing an MCVE is always a good first step (and fairly easy given Stata's sysuse and webuse commands). Now, on to the question.
See help regress postestimation and help predict for the proper syntax for generating new variables with residuals, etc. The syntax is a bit different from the gen command, as you will see below.
Note also that your drop if condition is improperly formatted, and right now is interpreted as drop if st_res1 != 0 | st_res2 > 3.0. (I also assume you want to drop standardized residuals < -3.0, but if this is incorrect, you can remove the abs() function.)
sysuse auto , clear
replace mpg = 10000 in 1/2
replace mpg = 0.0001 in 70
reg mpg weight if foreign
predict rst_for , rstandard
reg mpg weight if !foreign
predict rst_dom , rstandard
drop if abs(rst_for) > 3.0 | abs(rst_dom) > 3.0
Postscript: Note that you may also consider adding if e(sample) to your predict commands, depending on whether you wish to extrapolate the results of the subsample regression to the entire sample and evaluate all residuals, or whether you only wish to drop observations based on in-sample standardized residuals.
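If you do restrict predict to the estimation sample, every out-of-sample observation gets a missing standardized residual, and because missing values compare as larger than any number, a plain abs() > 3 test would then drop those observations too. A minimal sketch of the in-sample variant, with an explicit guard against missing values (the variable names follow the example above):

```stata
sysuse auto, clear

* standardized residuals for foreign cars only
regress mpg weight if foreign
predict rst_for if e(sample), rstandard

* standardized residuals for domestic cars only
regress mpg weight if !foreign
predict rst_dom if e(sample), rstandard

* abs(.) is missing and missing > 3 is true, so exclude missings explicitly
drop if (abs(rst_for) > 3 & !missing(rst_for)) | ///
        (abs(rst_dom) > 3 & !missing(rst_dom))
```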

How to use predict with stored e(b) from old regression

I know that one can get predicted values as follows:
reg y x1 x2 x3
predict pred_values
Let's say that I run a regression and store the values:
reg y x1 x2
matrix stored_b = e(b)
And then I run another regression (doesn't matter what).
Is it possible to use the predict command using stored_b instead of the current e(b)?
(Of course, I could generate the predicted values by manually computing them based on stored_b, but this could get tedious if there are many coefficients.)
There's no need to create a matrix. Stata has commands that facilitate the task. Try estimates store and estimates restore. An example:
clear
set more off
sysuse auto
// initial regression/predictions
regress price weight
estimates store myest
predict double resid, residuals
// second regression/prediction
regress price mpg
predict double residdiff, residuals
// backup and predict from initial regression results
estimates restore myest
predict double resid2, residuals
// should pass
assert resid == resid2
// should fail
assert resid == residdiff
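If only the coefficient vector was kept, as in the question, the linear prediction can still be recovered without writing out the formula by hand: matrix score multiplies each coefficient by the variable named in its column, including _cons. A short sketch, in which the variable names xb_direct and xb_from_b are chosen only for illustration:

```stata
sysuse auto, clear

* first regression: keep its coefficient vector and, for comparison, its prediction
regress price weight length
matrix stored_b = e(b)
predict double xb_direct, xb

* some other regression replaces the active e() results
regress price mpg

* rebuild the linear prediction of the first model from the stored vector
matrix score double xb_from_b = stored_b

* the two predictions should agree up to rounding
assert reldif(xb_direct, xb_from_b) < 1e-8
```

This gives the linear prediction only; statistics that need the variance matrix, such as standard errors of the prediction, still require estimates store and estimates restore.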

Recording marginal effects in Stata instead of coefficients in a regression table

I need to save the marginal effects of the models below in a table using estout or outreg. The commands I use below only save the coefficients in the table, not the marginal effects. I have tried many things and nothing is working.
sysuse auto
reg price mpg rep78 foreign, robust
margins, dydx(*)
estimates store m1, title(Model 1)
tobit price mpg rep78 foreign, ll(0)
margins, dydx(*) predict (ystar(0,.) )
estimates store m2, title(Model 2)
probit price mpg rep78 foreign
margins, dydx(*)
estimates store m3, title(Model 3)
truncreg price mpg rep78 foreign
margins, dydx(*) predict(e(0,.))
estimates store m4, title(Model 4)
estout m1 m2 m3 m4 , cells(b(star fmt(3)) se(par fmt(2)))
The example below shows what you ask for. Beware, however, that in the linear regression model the marginal effect equals the relevant slope coefficient (https://www3.nd.edu/~rwilliam/stats/Margins01.pdf), so you might already be getting correct results. (I was not able to run your code without running into an error unrelated to your original query.)
The example contains a linear and a non-linear model to emphasize the last point:
clear all
set more off
*----- example data -----
*from http://repec.org/bocode/e/estout/advanced.html
sysuse auto
generate reprec = (rep78 > 3) if rep78 < .
*----- what you want -----
eststo clear
regress foreign mpg reprec
margins, dydx(*) post
eststo modreg
logit foreign mpg reprec
margins, dydx(*) post
eststo modlog
esttab, se mtitles title(Marginal effects)
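The same post-then-store pattern carries over to models where margins needs a predict() option, such as the tobit in the question. The sketch below uses a censoring point of 17 on mpg purely so that the auto data contain censored observations; the limit, the regressors, and the name tobit_me are assumptions for illustration.

```stata
sysuse auto, clear
eststo clear

* tobit with left-censoring at 17 miles per gallon
tobit mpg weight foreign, ll(17)

* marginal effects on the censored expected value; post makes them the active e() results
margins, dydx(*) predict(ystar(17,.)) post
eststo tobit_me

esttab tobit_me, se mtitles title(Marginal effects)
```

Without the post option, margins leaves the original tobit coefficients in e(), which is why estimates store in the question saved coefficients rather than marginal effects.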