I am having an issue understanding why my Diff-in-Diff model results are different depending on how I specify the variables.
First, here is a data example:
```
dataex numeric_neighborhood month shareareabig DiD_implementation
* Example generated by -dataex-. For more info, type help dataex
clear
input long numeric_neighborhood float(month shareareabig DiD_implementation)
2675 1 0 0
2675 4 0 0
2675 18 0 0
2675 34 0 0
2676 110 0 0
2676 111 0 0
2676 112 0 0
end
```
Here is how I created my treatment, time, and group variables:
```
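* Post-period indicator: 1 from month 71 onward, 0 otherwise (including missing month)
gen treatment_time = (month>=71 & !missing(month))
* Treated-group indicator: 1 for neighborhoods in cities A-D, 0 otherwise
gen treated_neighborhood = 1 if city_trans=="A" | city_trans=="B" | city_trans=="C" | city_trans=="D"
replace treated_neighborhood=0 if treated_neighborhood==.
* DiD term: 1 only for treated neighborhoods in the post-period
gen DiD_implementation= treatment_time*treated_neighborhood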
```
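As a quick sanity check on these variables, here is a minimal sketch. It uses only the variable names created above and assumes city_trans is a string variable. The inlist() line is an alternative way to build the same group indicator, and treated_check is a hypothetical name used only for the comparison.

```
* Cross-tabulate the group and period indicators, showing missing values if any
tabulate treated_neighborhood treatment_time, missing

* The DiD term should only take the values 0 and 1
assert inlist(DiD_implementation, 0, 1)

* Alternative construction of the group indicator (treated_check is a hypothetical name)
gen treated_check = inlist(city_trans, "A", "B", "C", "D")
assert treated_check == treated_neighborhood
```

If either assert fails, Stata stops with an error, which would point to a problem in how the indicators were built rather than in the regression itself.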
I ran the first model as follows:
```
xtset numeric_neighborhood month, monthly
xtreg shareareabig i.DiD_implementation i.month, fe cluster(numeric_neighborhood)
```
And I get the following output for my main coefficient:
```
                     |               Robust
        shareareabig | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
1.DiD_implementation |   -1.562495   .3421168  -4.57    0.000    -2.233382   -.8916086
```
However, when I run the same model without the i. prefix on my treatment and time variables, I get a different coefficient for "DiD_implementation":
```
xtset numeric_neighborhood month, monthly
xtreg shareareabig DiD_implementation month, fe cluster(numeric_neighborhood)
```
I get the following:
```
                   |               Robust
      shareareabig | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
DiD_implementation |   -2.080575   .2743751  -7.58    0.000    -2.618621   -1.542529
```
I realize that both are in the same direction and statistically significant, but I am confused about the following:
1- Why is the magnitude in the second model larger than in the first?
2- Is one model more correct than the other?
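To narrow down which i. prefix matters, here is a sketch of two further runs that could be compared with the output above. It assumes DiD_implementation only takes the values 0 and 1, in which case i.DiD_implementation and DiD_implementation should give the same coefficient. Any difference would then come from i.month (a separate dummy for each month) versus month (a single linear slope). c.month is Stata's factor-variable notation for treating month as continuous.

```
xtset numeric_neighborhood month, monthly

* i. on the treatment only, month as one continuous regressor:
* expected to reproduce the second model's coefficient if the treatment is 0/1
xtreg shareareabig i.DiD_implementation c.month, fe cluster(numeric_neighborhood)

* No i. on the treatment, but a full set of month dummies:
* expected to reproduce the first model's coefficient if the treatment is 0/1
xtreg shareareabig DiD_implementation i.month, fe cluster(numeric_neighborhood)
```

If these two runs match the second and first models respectively, the gap between -2.08 and -1.56 is due to how month enters the model and not to the i. on the treatment variable.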