Today I was trying to compute standard error estimates for regression coefficients by hand. (This was just to walk some students through the minutiae.) My results did not match Stata's reported estimates. I tried a smaller dataset and had the same problem. My code and the (smaller) dataset are below. The differences are small-ish, I know, but still. Did I miss something here?
motherinches studentinches ideology male momsed
62 62 6 0 3
63 66 7 0 3
65 75 6 1 3
64 66 4 0 3
68 69 3 1 3
65 73 4 1 3
62 64 3 0 1
65 66 6 0 1
62 65 3 0 1
66 71 6 1 1
67 69 3 1 1
61 62 3 1 1
64 62 2 0 4
64 62 7 0 4
64 73 3 1 4
63 62 3 0 4
68 73 6 1 4
62 67 3 1 4
66 68 1 1 4
63 72 7 1 4
62 70 4 1 4
60 62 2 0 4
66 74 7 1 4
65 62 2 0 4
62 65 2 0 4
68 62 4 0 2
66 68 5 1 2
68 62 3 0 2
61 62 5 0 2
60 62 5 0 2
Code and reported output
use "C:\Users\las02013\Dropbox\classes\stats2\height_m _sample.dta", clear
reg studentinches motherinches male momsed ideology
      Source |       SS           df       MS      Number of obs   =        30
-------------+----------------------------------   F(4, 25)        =     18.05
       Model |   422.99631         4  105.749077   Prob > F        =    0.0000
    Residual |  146.470357        25  5.85881427   R-squared       =    0.7428
-------------+----------------------------------   Adj R-squared   =    0.7016
       Total |  569.466667        29  19.6367816   Root MSE        =    2.4205
------------------------------------------------------------------------------
studentinc~s | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
motherinches |   .2451402   .1982764     1.24   0.228    -.1632177    .6534982
        male |    6.28862   .9532934     6.60   0.000     4.325276    8.251965
      momsed |   .4846678   .3805885     1.27   0.215    -.2991689    1.268505
    ideology |   .6433369   .2544839     2.53   0.018     .1192176    1.167456
       _cons |   43.82337   12.70533     3.45   0.002     17.65625     69.9905
------------------------------------------------------------------------------
predict yhat
generate double error = studentinches - yhat
generate double sq_er = error^2
egen double sse = sum(sq_er)
generate double mse = sse/(e(N) - e(df_m) - 1)
* 1. get the mean of each X
egen double mean_mother = mean(motherinches)
for var ideology male momsed: egen double mean_X = mean(X)
* 2. compute the sum of squared deviations of each X
egen double ssd_mother = sum((motherinches - mean_mother)^2)
for var ideology male momsed: egen double ssd_X = sum((X - mean_X)^2)
* 3. generate the se for each X in the model
generate double se_mother = (mse/ssd_mother)^.5
for var ideology male momsed: generate double se_X = (mse/ssd_X)^.5
list se* in 1
+-----------------------------------------------+
| se_mother se_ideology se_male se_momsed |
|-----------------------------------------------|
| .18571657 .25212609 .88581157 .37588515 |
+-----------------------------------------------+
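One thing that may explain the gap: the expression in step 3, sqrt(mse/ssd), is the simple-regression standard error. With several regressors, the textbook expression also divides by (1 - R2_j), where R2_j is the R-squared from regressing X_j on the other regressors. Below is a minimal sketch for motherinches only, assuming the same dataset is in memory. The scalar names (mse_full, r2_aux, ssd_x) are made up for illustration and are not part of the code above.

```stata
* Sketch: standard error of the motherinches coefficient in the
* multiple regression, via the auxiliary-regression R-squared.

* Full model: keep the residual mean square (Root MSE squared)
quietly regress studentinches motherinches male momsed ideology
scalar mse_full = e(rmse)^2

* Auxiliary model: motherinches on the remaining regressors
quietly regress motherinches male momsed ideology
scalar r2_aux = e(r2)

* Sum of squared deviations of motherinches around its mean
quietly summarize motherinches
scalar ssd_x = r(Var) * (r(N) - 1)

* se(b_j) = sqrt( MSE / (SSD_j * (1 - R2_j)) )
display "se_mother = " sqrt(mse_full / (ssd_x * (1 - r2_aux)))
```

If the regressors were uncorrelated with one another, r2_aux would be zero and this would reduce to the step 3 formula. Otherwise the result should be somewhat larger than se_mother above and should agree, up to rounding, with the Std. err. that regress reports.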