Rounding results in OLS regression - output

This is my regression and its output:
How can I round the coefficients and the standard errors to two decimal places?

You can use the cformat() option. For example:
sysuse auto
regress price length, cformat(%5.2f)
Source | SS df MS Number of obs = 74
-------------+---------------------------------- F(1, 72) = 16.50
Model | 118425867 1 118425867 Prob > F = 0.0001
Residual | 516639529 72 7175549.01 R-squared = 0.1865
-------------+---------------------------------- Adj R-squared = 0.1752
Total | 635065396 73 8699525.97 Root MSE = 2678.7
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
length | 57.20 14.08 4.06 0.000 29.13 85.27
_cons | -4584.90 2664.44 -1.72 0.090 -9896.36 726.56
------------------------------------------------------------------------------
In the Stata manual for regress there is a useful discussion of display options.
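If you want this formatting applied to all estimation output in the session, rather than per command, you can also set it globally (a sketch; set cformat is covered in the same discussion of display options):
set cformat %5.2f // applies to coefficients, standard errors, and confidence limits
regress price length // now displayed with two decimals, no option needed
The related sformat() and pformat() options control the display of test statistics and p-values in the same way.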


Regression with all variables without explicitly declaring them

I have a dataset that I would like to run a regression on in Stata. I want to make one of the dummy variables the base, so I use ib1.month1 in the regress command.
Is it possible to include in my regression all other variables in the dataset without explicitly writing out each variable again?
You can use the ds command:
sysuse auto, clear
drop make // make is a string variable and cannot enter the regression
ds price foreign, not // leaves all variables except price and foreign in r(varlist)
regress price ib1.foreign `r(varlist)'
Source | SS df MS Number of obs = 69
-------------+---------------------------------- F(10, 58) = 8.66
Model | 345416162 10 34541616.2 Prob > F = 0.0000
Residual | 231380797 58 3989324.09 R-squared = 0.5989
-------------+---------------------------------- Adj R-squared = 0.5297
Total | 576796959 68 8482308.22 Root MSE = 1997.3
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
foreign |
Domestic | -3334.848 957.2253 -3.48 0.001 -5250.943 -1418.754
mpg | -21.80518 77.3599 -0.28 0.779 -176.6578 133.0475
rep78 | 184.7935 331.7921 0.56 0.580 -479.3606 848.9476
headroom | -635.4921 383.0243 -1.66 0.102 -1402.198 131.2142
trunk | 71.49929 95.05012 0.75 0.455 -118.7642 261.7628
weight | 4.521161 1.411926 3.20 0.002 1.694884 7.347438
length | -76.49101 40.40303 -1.89 0.063 -157.3665 4.38444
turn | -114.2777 123.5374 -0.93 0.359 -361.5646 133.0092
displacement | 11.54012 8.378315 1.38 0.174 -5.230896 28.31115
gear_ratio | -318.6479 1124.34 -0.28 0.778 -2569.259 1931.964
_cons | 13124.34 6726.3 1.95 0.056 -339.8103 26588.5
------------------------------------------------------------------------------
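One caveat about this approach: r(varlist) is overwritten by the next r-class command, so if anything else runs between ds and regress, stash the list in a local macro first. A minimal sketch:
ds price foreign, not
local rhs `r(varlist)' // save the list before r() is overwritten
regress price ib1.foreign `rhs'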

Unit-specific Trends and R-squared near 1

I am currently working on a country panel dataset, on which I am running a difference-in-differences regression that includes unit-specific trends in Stata.
My main concern is that the adjusted R-squared is very high, sometimes as high as 0.99. I assume this is a sign of some kind of mistake, but I do not know how to correct it.
The model has nearly 5,000 observations, 201 countries, 36 years, and 5 control variables, so the number of parameters is around 450.
Here I attach the code used:
xtset id_num year // id_num = id_country
reg `outcome' i.treatment i.year i.id_num c.year#i.id_num `controls' if id_country!="USA" & `subgroup'==1, cluster(id_num)
In case it is useful, this is the first part of the output:
note: 201.id_num#c.year omitted because of collinearity
Linear regression Number of obs = 4,789
F(39, 174) = .
Prob > F = .
R-squared = 0.9994
Root MSE = .20753
(Std. Err. adjusted for 175 clusters in id_country)
-------------------------------------------------------------------------------
| Robust
obesity_as | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
1.treatment | .1847802 .1341994 1.38 0.170 -.080088 .4496483
|
year |
1981 | .2162895 .0156983 13.78 0.000 .185306 .2472731
1982 | .4461132 .0224864 19.84 0.000 .4017319 .4904944
1983 | .6690157 .0281392 23.78 0.000 .6134777 .7245538
1984 | .915047 .0311529 29.37 0.000 .8535609 .9765332
1985 | 1.177176 .0344991 34.12 0.000 1.109085 1.245266
1986 | 1.421679 .0389734 36.48 0.000 1.344758 1.498601
1987 | 1.68354 .0413294 40.73 0.000 1.601969 1.765112
1988 | 1.963494 .0440206 44.60 0.000 1.876611 2.050377
1989 | 2.236331 .0472635 47.32 0.000 2.143048 2.329615
1990 | 2.52923 .0498206 50.77 0.000 2.4309 2.62756

How can I specify the base level of a factor variable?

I have data for 2000-2016 and I am trying to estimate the following regression:
xtset id
xtreg lnp i.year i.year#fp, fe vce(robust)
However, when I do this, Stata omits 2008 because of collinearity.
Is there a way to specify which year is omitted?
More generally, you can specify the omitted level of a factor variable (i.e., the base) by using the ib operator (see also help fvvarlist).
Below is a reproducible example using Stata's toy dataset nlswork:
webuse nlswork, clear
xtset idcode
Using 77 as the base year:
xtreg ln_wage ib77.year age, fe vce(robust)
Fixed-effects (within) regression Number of obs = 28,510
Group variable: idcode Number of groups = 4,710
R-sq: Obs per group:
within = 0.1060 min = 1
between = 0.0914 avg = 6.1
overall = 0.0805 max = 15
F(15,4709) = 69.49
corr(u_i, Xb) = 0.0467 Prob > F = 0.0000
(Std. Err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
| Robust
ln_wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
year |
68 | -.108365 .1111117 -0.98 0.329 -.3261959 .1094659
69 | -.0335029 .0995142 -0.34 0.736 -.2285973 .1615915
70 | -.0604953 .0867605 -0.70 0.486 -.2305866 .1095959
71 | -.0218073 .0742761 -0.29 0.769 -.1674232 .1238087
72 | -.0226893 .0622792 -0.36 0.716 -.1447857 .0994071
73 | -.0203581 .049851 -0.41 0.683 -.1180894 .0773732
75 | -.0305043 .0259707 -1.17 0.240 -.081419 .0204104
78 | .0225868 .0147272 1.53 0.125 -.0062854 .0514591
80 | .0058999 .0381391 0.15 0.877 -.0688706 .0806704
82 | .0006801 .0622403 0.01 0.991 -.1213399 .1227001
83 | .0127622 .074435 0.17 0.864 -.1331653 .1586897
85 | .0381987 .0989316 0.39 0.699 -.1557535 .2321508
87 | .0298993 .1237839 0.24 0.809 -.2127751 .2725736
88 | .0716091 .1397635 0.51 0.608 -.2023927 .345611
|
age | .0125992 .0123091 1.02 0.306 -.0115323 .0367308
_cons | 1.312096 .3453967 3.80 0.000 .6349571 1.989235
-------------+----------------------------------------------------------------
sigma_u | .4058746
sigma_e | .30300411
rho | .64212421 (fraction of variance due to u_i)
------------------------------------------------------------------------------
Using 80 as the base year:
xtreg ln_wage ib80.year age, fe vce(robust)
Fixed-effects (within) regression Number of obs = 28,510
Group variable: idcode Number of groups = 4,710
R-sq: Obs per group:
within = 0.1060 min = 1
between = 0.0914 avg = 6.1
overall = 0.0805 max = 15
F(15,4709) = 69.49
corr(u_i, Xb) = 0.0467 Prob > F = 0.0000
(Std. Err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
| Robust
ln_wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
year |
68 | -.1142649 .1480678 -0.77 0.440 -.4045471 .1760172
69 | -.0394028 .136462 -0.29 0.773 -.3069323 .2281266
70 | -.0663953 .1237179 -0.54 0.592 -.3089402 .1761497
71 | -.0277072 .1112026 -0.25 0.803 -.2457164 .190302
72 | -.0285892 .0991208 -0.29 0.773 -.2229124 .165734
73 | -.026258 .0866489 -0.30 0.762 -.1961303 .1436142
75 | -.0364042 .0625743 -0.58 0.561 -.1590791 .0862706
77 | -.0058999 .0381391 -0.15 0.877 -.0806704 .0688706
78 | .0166869 .0258678 0.65 0.519 -.0340261 .0673999
82 | -.0052198 .0257713 -0.20 0.840 -.0557437 .0453041
83 | .0068623 .0378166 0.18 0.856 -.0672759 .0810005
85 | .0322987 .0620538 0.52 0.603 -.0893558 .1539533
87 | .0239993 .0868397 0.28 0.782 -.1462471 .1942457
88 | .0657092 .1028815 0.64 0.523 -.1359868 .2674052
|
age | .0125992 .0123091 1.02 0.306 -.0115323 .0367308
_cons | 1.317996 .3824809 3.45 0.001 .5681546 2.067838
-------------+----------------------------------------------------------------
sigma_u | .4058746
sigma_e | .30300411
rho | .64212421 (fraction of variance due to u_i)
------------------------------------------------------------------------------
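Besides a literal level such as ib77 or ib80, help fvvarlist also documents symbolic base operators. A sketch on the same model:
xtreg ln_wage ib(last).year age, fe vce(robust) // the last year (88) as the base
xtreg ln_wage ib(freq).year age, fe vce(robust) // the most frequent year as the base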

Essential Prime Implicants and Minterm Expressions

I have an exam for a university course shortly, and on reviewing one of my assignments I realized that I don't understand why I lost marks on a couple of questions, or how to do them. Hopefully someone can shed some light on the subject for me! The questions were as follows:
Use K-maps to simplify the following Boolean functions (note that d() represents a don't-care minterm):
1.) F(w, x, y, z) = ∑(1,3,5,7,11,12,13,15)
My answer:
Prime Implicants: yz, w'z, xz, wxy'
Essential Prime Implicants: yz, w'z, wxy'
Possible Minimal Expression(s): yz + w'z + wxy'
Answer sheet (professor's answer):
Prime Implicants: yz, w'z, xz, wxy'
Essential Prime Implicants: Same as prime implicants
Possible Minimal Expression(s): yz + w'z + xz + wxy'
2.) F(w, x, y, z) = ∑(1,2,5,7,12) + d(0,9,13)
My answer:
Prime Implicants: w'x'z', y'z, w'xz, wxy', w'x'y'
Essential Prime Implicants: w'x'z', w'xz, wxy'
Possible Minimal Expression(s): w'x'z' + w'xz + wxy'
Answer sheet (professor's answer):
Prime Implicants: w'x'z', y'z, w'xz, wxy', w'x'y'
Essential Prime Implicants: w'x'z', w'xz, wxy'
Possible Minimal Expression(s): w'x'z' + w'xz + wxy' + y'z
I suppose I should add that I asked my professor after he returned my assignment to me if he had made a mistake and explained my point of view. He seemed pretty certain that he was correct, but couldn't really explain why because he speaks poor English (well, that's university for you..).
Thanks in advance to anyone who can help! This has been quite a task to try to figure out on my own!
1.) You are correct: xz is not an essential prime implicant. It does not cover any minterm that is not already covered by other prime implicants, so it can be removed from the solution.
The Karnaugh map might help to see this more clearly:
wx
00 01 11 10
+---+---+---+---+
00 | 0 | 0 | 1 | 0 |
+---+---+---+---+
01 | 1 | 1 | 1 | 0 |
yz +---+---+---+---+
11 | 1 | 1 | 1 | 1 |
+---+---+---+---+
10 | 0 | 0 | 0 | 0 |
+---+---+---+---+
I am not sure what is meant by "possible minimal expressions". If you enumerate all prime implicants that can be circled in the map, xz would also be one, but it is not needed in a minimal expression.
2.) Here the answer sheet is actually right. The three essential prime implicants w'x'z', w'xz, and wxy' cover minterms 2, 5, 7, and 12 (plus don't-cares), but minterm 1 (0001) is covered by none of them. It is covered only by the two non-essential prime implicants y'z and w'x'y', so one of these must be added to complete the cover. Since y'z is cheaper (two literals instead of three), the minimal expression is the one on the answer sheet:
F = w'x'z' + w'xz + wxy' + y'z
1.) F(w, x, y, z) = ∑(1,3,5,7,11,12,13,15)
wx
00 01 11 10
+---+---+---+---+
00 | 0 | 0 | 1 | 0 |
+---+---+---+---+
01 | 1 | 1 | 1 | 0 |
yz +---+---+---+---+
11 | 1 | 1 | 1 | 1 |
+---+---+---+---+
10 | 0 | 0 | 0 | 0 |
+---+---+---+---+
Note: here, one essential prime implicant is formed by minterms 12 and 13:
wxyz
1100
1101
result is wxy'
If you compute the prime implicant formed by minterms 3, 7, 11, and 15:
wxyz
0011
0111
1111
1011
result is yz
and if you compute the prime implicant formed by minterms 1, 5, 3, and 7:
wxyz
0001
0101
0011
0111
result is w'z
so the essential prime implicants are wxy', yz, and w'z.
xz is not an essential prime implicant: the prime implicant formed by minterms 5, 7, 13, and 15 covers nothing that wxy', yz, and w'z do not already cover, so it is redundant.

Constraining slope

I'm a beginner in Stata. I'm trying to run the following regression:
regress logy logI logh logL
but I would like to constrain the slope of logh to be one. Can someone tell me the command for this?
There are at least three ways to do this in Stata.
1) Use constrained linear regression:
. sysuse auto
(1978 Automobile Data)
. constraint 1 mpg = 1
. cnsreg price mpg weight, constraints(1)
Constrained linear regression Number of obs = 74
Root MSE = 2502.5449
( 1) mpg = 1
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mpg | 1 (constrained)
weight | 2.050071 .3768697 5.44 0.000 1.298795 2.801347
_cons | -46.14764 1174.541 -0.04 0.969 -2387.551 2295.256
------------------------------------------------------------------------------
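Applied to the model in the question, the same pattern would be (a sketch, using the variable names given there):
. constraint 1 logh = 1
. cnsreg logy logI logh logL, constraints(1)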
2) Variable transformation (suggested by whuber in the comment above):
. gen price2 = price - mpg
. reg price2 weight
Source | SS df MS Number of obs = 74
-------------+------------------------------ F( 1, 72) = 29.59
Model | 185318670 1 185318670 Prob > F = 0.0000
Residual | 450916627 72 6262730.93 R-squared = 0.2913
-------------+------------------------------ Adj R-squared = 0.2814
Total | 636235297 73 8715552.01 Root MSE = 2502.5
------------------------------------------------------------------------------
price2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | 2.050071 .3768697 5.44 0.000 1.298795 2.801347
_cons | -46.14764 1174.541 -0.04 0.969 -2387.551 2295.256
------------------------------------------------------------------------------
3) Using a GLM with an offset:
. glm price weight , family(gaussian) link(identity) offset(mpg)
Iteration 0: log likelihood = -683.04238
Iteration 1: log likelihood = -683.04238
Generalized linear models No. of obs = 74
Optimization : ML Residual df = 72
Scale parameter = 6262731
Deviance = 450916626.9 (1/df) Deviance = 6262731
Pearson = 450916626.9 (1/df) Pearson = 6262731
Variance function: V(u) = 1 [Gaussian]
Link function : g(u) = u [Identity]
AIC = 18.51466
Log likelihood = -683.0423847 BIC = 4.51e+08
------------------------------------------------------------------------------
| OIM
price | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | 2.050071 .3768697 5.44 0.000 1.31142 2.788722
_cons | -46.14764 1174.541 -0.04 0.969 -2348.205 2255.909
mpg | 1 (offset)
------------------------------------------------------------------------------
The glm route could also handle the log transformation of your outcome for you if you change the link and family options appropriately.
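For instance, assuming your raw (untransformed) outcome variable is named y, something along these lines (a sketch; note that a log link models the log of the mean of y, which is not the same thing as the mean of log y):
. glm y logI logL, family(gaussian) link(log) offset(logh)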