Dear All,
I am trying to understand why the results of a difference-in-differences estimation differ between the user-written diff command and running it 'manually' with interaction terms. To illustrate my confusion, I have run the following:
use http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta
(Dataset from Card&Krueger (1994))
. diff fte, t(treated) p(t)
DIFFERENCE-IN-DIFFERENCES ESTIMATION RESULTS
Number of observations in the DIFF-IN-DIFF: 801
            Baseline   Follow-up
   Control: 78         77          155
   Treated: 326        320         646
            404        397
------------------------------------------------------
Outcome var. | fte | S. Err. | t | P>|t|
----------------+---------+---------+-------+---------
Baseline | | | |
Control | 19.949 | | |
Treated | 17.065 | | |
Diff (T-C) | -2.884 | 1.135 | -2.54 | 0.011**
Follow-up | | | |
Control | 17.542 | | |
Treated | 17.573 | | |
Diff (T-C) | 0.030 | 1.143 | 0.03 | 0.979
| | | |
Diff-in-Diff | 2.914 | 1.611 | 1.81 | 0.071*
------------------------------------------------------
R-square: 0.01
* Means and Standard Errors are estimated by linear regression
**Inference: *** p<0.01; ** p<0.05; * p<0.1
. gen did=t*treated
. reg fte did treated t, vce(robust)
Linear regression Number of obs = 801
F( 3, 797) = 1.43
Prob > F = 0.2330
R-squared = 0.0080
Root MSE = 9.003
------------------------------------------------------------------------------
| Robust
fte | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
did | 2.913982 1.736818 1.68 0.094 -.4952963 6.323261
treated | -2.883534 1.403338 -2.05 0.040 -5.638209 -.1288592
t | -2.40651 1.594091 -1.51 0.132 -5.535623 .7226031
_cons | 19.94872 1.317281 15.14 0.000 17.36297 22.53447
------------------------------------------------------------------------------
. reg fte did treated t, vce(cluster id)
Linear regression Number of obs = 801
F( 3, 408) = 1.89
Prob > F = 0.1305
R-squared = 0.0080
Root MSE = 9.003
(Std. Err. adjusted for 409 clusters in id)
------------------------------------------------------------------------------
| Robust
fte | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
did | 2.913982 1.291448 2.26 0.025 .3752599 5.452705
treated | -2.883534 1.401798 -2.06 0.040 -5.639182 -.1278858
t | -2.40651 1.207109 -1.99 0.047 -4.779439 -.0335815
_cons | 19.94872 1.318071 15.13 0.000 17.35766 22.53978
------------------------------------------------------------------------------
Although the point estimates from the three estimations are the same, the standard errors differ. I am not sure I understand why they are different. I would really appreciate some insight into this.
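To make the comparison easier to read, the standard errors on did could be lined up side by side. This is only a minimal sketch, assuming the same dataset and variable names as above; the stored-estimate names ols, robust, and cluster are labels chosen for illustration. The first model uses Stata's default (non-robust) variance estimator, which neither regress call above specifies.

```stata
* Sketch only: one model, three variance estimators, compared side by side
use http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta, clear

* Interaction term built as in the post
gen did = t*treated

* (1) Default OLS standard errors (no vce() option)
reg fte did treated t
estimates store ols

* (2) Heteroskedasticity-robust standard errors
reg fte did treated t, vce(robust)
estimates store robust

* (3) Standard errors clustered on id
reg fte did treated t, vce(cluster id)
estimates store cluster

* Coefficient and standard error on did under each estimator
estimates table ols robust cluster, keep(did) b(%9.3f) se(%9.3f)
```

The point estimate should be identical in all three columns, so any difference in the table comes only from the variance estimator.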
Also, if I wanted to include a store fixed effect, could I incorporate it simply as follows:
xtset id
panel variable: id (unbalanced)
. xtreg fte did treated t, fe vce(cluster id)
Fixed-effects (within) regression Number of obs = 801
Group variable: id Number of groups = 409
R-sq: within = 0.0180 Obs per group: min = 1
between = 0.0052 avg = 2.0
overall = 0.0006 max = 4
F(2,408) = .
corr(u_i, Xb) = -0.1811 Prob > F = .
(Std. Err. adjusted for 409 clusters in id)
------------------------------------------------------------------------------
| Robust
fte | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
did | 2.942513 1.318861 2.23 0.026 .3499016 5.535123
treated | 1.278744 .6594305 1.94 0.053 -.0175617 2.575049
t | -2.490132 1.23446 -2.02 0.044 -4.916828 -.0634351
_cons | 16.62192 .6164617 26.96 0.000 15.41008 17.83376
-------------+----------------------------------------------------------------
sigma_u | 8.0015503
sigma_e | 6.2117819
rho | .62395631 (fraction of variance due to u_i)
------------------------------------------------------------------------------
But in this case both the estimates and the standard errors become very different.
I would really appreciate some help.
Sincerely,
Sumedha.