Different DiD results with xtreg (fe) vs. xtreg with i.dummy variables

Paolo Maldini

Join Date: Feb 2022

Posts: 49
#1

23 Apr 2022, 10:58

I am having an issue understanding why my Diff-in-Diff model results are different depending on how I specify the variables.

First, here is a data example:
```
* Output of: dataex numeric_neighborhood month shareareabig DiD_implementation
* Example generated by -dataex-. For more info, type help dataex
clear
input long numeric_neighborhood float(month shareareabig DiD_implementation)
2675 1 0 0
2675 4 0 0
2675 18 0 0
2675 34 0 0
2676 110 0 0
2676 111 0 0
2676 112 0 0
end
```

Here is how I created my treatment, time, and group variables:

```
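* Post-period indicator: 1 from month 71 onward, 0 before (a missing month also yields 0)
gen treatment_time = (month>=71 & !missing(month))
* Treated-group indicator: 1 for neighborhoods in cities A-D, 0 otherwise
gen treated_neighborhood = inlist(city_trans, "A", "B", "C", "D")
* DiD term: 1 only for treated neighborhoods in the post period
gen DiD_implementation = treatment_time*treated_neighborhood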
```

I ran the first model as follows:
```
xtset numeric_neighborhood month, monthly
xtreg shareareabig i.DiD_implementation i.month, fe cluster(numeric_neighborhood)
```

And I get the following output for my main coefficient:
```
                     |               Robust
        shareareabig | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
1.DiD_implementation |  -1.562495   .3421168    -4.57   0.000    -2.233382   -.8916086
```

However, when I run the same model without the i. prefix on my treatment and time variables, I get a different coefficient for DiD_implementation:

```
xtset numeric_neighborhood month, monthly
xtreg shareareabig DiD_implementation month, fe cluster(numeric_neighborhood)
```

I get the following:
```
-----------------------------------------------------------------------------------
                   |               Robust
      shareareabig | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
DiD_implementation |  -2.080575   .2743751    -7.58   0.000    -2.618621   -1.542529
```

I realize that both are in the same direction and statistically significant, but I am confused about the following:

1. Why is the magnitude in the second model larger than in the first?
2. Is one model more correct than the other?

Last edited by Paolo Maldini; 23 Apr 2022, 11:34.
Tags: data, panel, panel data, regression
Clyde Schechter

Join Date: Apr 2014

Posts: 29816
#2

23 Apr 2022, 11:15

The two ways of coding these variables produce two very different models. The DiD_implementation variable, if it is a 0/1 variable (I can't tell from the limited example data shown), is not affected. But the month variable is treated very differently. With the i. prefix, you are asking Stata to model idiosyncratic monthly shocks to the outcome variable. Without the i. prefix, you are telling Stata to model a linear trend over time in the outcome variable. Those are very different models, and they lead to different results. The fact that the linear trend model leaves a larger coefficient for the DiD_implementation variable suggests that there are monthly shocks that, when left unaccounted for (a linear trend model can't account for them), are associated with the value of the DiD_implementation variable, and that the monthly shock model can separate them out.
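
In equation form, the difference can be sketched as follows. The notation is an assumption introduced here for illustration, not taken from the posted output: $y_{it}$ is shareareabig for neighborhood $i$ in month $t$, $D_{it}$ is DiD_implementation, and $\alpha_i$ is the neighborhood fixed effect absorbed by the fe option.

```latex
\begin{align*}
\text{with i.month:} \quad
  y_{it} &= \alpha_i + \beta D_{it} + \sum_{m} \gamma_m \,\mathbf{1}[t = m] + \varepsilon_{it} \\
\text{with month:} \quad
  y_{it} &= \alpha_i + \beta D_{it} + \delta \, t + \varepsilon_{it}
\end{align*}
```

The first specification estimates a separate $\gamma_m$ for every month (less a base month), while the second forces all month-to-month movement onto a single slope $\delta$.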

As to which is correct, that depends on what you believe to be the behavior over time of the outcome variable in the real world. I can't help you with that question.
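
One way to get a feel for the difference is to run both specifications on simulated data where the truth is known. The following is only a minimal sketch: the variable names (id, treated, post, did, shock, y), the panel dimensions, and the simulated effect of -1.5 are all hypothetical choices, not features of the data in this thread.

```stata
* Hypothetical simulated panel: every name and number here is an illustrative assumption
clear
set seed 12345
set obs 40                                   // 40 neighborhoods
gen long id = _n
gen byte treated = id <= 20                  // first 20 neighborhoods are treated
gen double alpha = rnormal()                 // neighborhood fixed effect
expand 100                                   // 100 months per neighborhood
bysort id: gen int month = _n
gen byte post = month >= 71                  // post period starts in month 71
gen byte did = treated*post                  // DiD indicator
* Common time shock that is NOT linear in month: a level shift plus a cycle
gen double shock = -1*post + 0.5*sin(month/6)
gen double y = alpha + shock - 1.5*did + rnormal()

xtset id month
* Specification 1: a separate effect for every month
xtreg y i.did i.month, fe vce(cluster id)
estimates store month_dummies
* Specification 2: a single linear time trend
xtreg y i.did c.month, fe vce(cluster id)
estimates store linear_trend
* Side-by-side comparison of the DiD coefficient and its standard error
estimates table month_dummies linear_trend, keep(1.did) b se
```

Because the simulated shock is common to all neighborhoods but not linear in time, one would expect the month-dummy specification to absorb it, while the linear-trend specification may load part of it onto the did coefficient. Which pattern holds in real data is an empirical matter.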

Having answered your questions, I have a question for you. What is that month variable? How did you code it? Interpreted as a Stata internal format monthly variable, it appears you are working with data from the 1960's. That's possible and perfectly OK, of course, but we don't see much of that these days. So I'm wondering about the validity of that variable in the first place.
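
To illustrate the point about the internal format: Stata's %tm format counts months elapsed since January 1960, so small integers display as dates in the 1960s. A short sketch, using month values that appear in this thread:

```stata
* %tm counts months elapsed since 1960m1 (which is stored as 0)
display %tm 0      // 1960m1
display %tm 71     // 1965m12, the treatment start month used in this thread
display %tm 112    // 1969m5
```

If month is just a running month index rather than a Stata monthly date, running xtset without the monthly option keeps it displayed as a plain integer. That is a suggestion under that assumption; the display format does not change the regression results either way.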

Last edited by Clyde Schechter; 23 Apr 2022, 11:18.
Paolo Maldini

Join Date: Feb 2022

Posts: 49
#3

23 Apr 2022, 11:40

Thanks for the thorough answer, this is super helpful!
I have updated the post to show how the time and treatment variables were generated.
I do believe there are idiosyncratic monthly shocks in the real outcome variable I am measuring, so I will use the i. prefix in my model.

Lastly, on the date question: I am working with date variables in the Hijri calendar, which Stata reads as dates from the 1960s when in fact the data cover 2002-2022.
Clyde Schechter

Join Date: Apr 2014

Posts: 29816
#4

23 Apr 2022, 11:48

Great. Thanks for the clarifications.