Manually calculating the interaction term vs. using i.treat#i.post in a DiD model

Lucy Garcia

Join Date: Sep 2021

Posts: 47
#1

29 Mar 2022, 22:43

Dear Statalist,

I'm doing a (generalized?) DiD with firm and year fixed effects. In my sample, different firms switch to a certain status in different years. post = 1 if the year is the switch year or any later year; treat = 1 if the firm switched to that status in that particular year (i.e., treat = 1 only in the year the switch happens).

From my understanding, this two-way fixed effects model would only have the interaction term, but not the individual terms of treat and post. I tried two ways of calculating the interaction term:

1. manually calculate before running the regression

Code:

gen treat_post = treat*post
xtreg y treat_post controls i.year if _weight!=. , fe robust

the regression result:

Code:

------------------------------------------------------------------------------------------
                         |               Robust
                       y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
              treat_post |   .2432795   .1246816     1.95   0.052    -.0020436    .4886026

2. include i.treat#i.post in the regression

Code:

xtreg y i.treat#i.post controls i.year if _weight!=. , fe robust

the regression result:

Code:

------------------------------------------------------------------------------------------
                         |               Robust
                       y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
              treat#post |
                    0 1  |  -.8282567   .3271798    -2.53   0.012    -1.472015   -.1844989
                    1 0  |          0  (empty)
                    1 1  |  -.2441863    .239058    -1.02   0.308     -.714556    .2261834
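
The (empty) row is what the coding described at the top of this post implies: treat = 1 only in the switch year, and post = 1 from the switch year onward, so no observation can have treat = 1 and post = 0. A minimal sketch on toy data that illustrates this; the firm, year and switch_year values are made up, never-switching firms are assumed to have post = 0, and chk_treat_post is a hypothetical name:

```stata
* Toy panel: 3 hypothetical firms, 6 years each (assumed values, not the real sample)
clear
set obs 3
generate firm = _n
generate switch_year = 2011 + firm if firm < 3   // firm 3 never switches
expand 6
bysort firm: generate year = 2009 + _n

* Coding as described in the post
generate byte post  = year >= switch_year & !missing(switch_year)
generate byte treat = year == switch_year

* Which treat/post combinations occur? The (1,0) cell should be empty
tabulate treat post

* treat == 1 implies post == 1, so the product is identical to treat
assert post == 1 if treat == 1
generate byte chk_treat_post = treat * post
assert chk_treat_post == treat
```

If the same two assert statements pass on the real sample, treat_post in the first regression is the same variable as treat.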

So my questions:
1. Which one is the correct way to do a DiD with two-way fixed effects?
2. What do the 0 1, 1 0, 1 1 rows in the second output mean? And why is one of them empty (has no observations)?
3. When using -xtreg-, there are 3 R^2 values reported: within, between and overall. Which one should I mainly focus on and report?
Here are the 3 R^2 values for the first regression:

Code:

R-sq:
     within  = 0.2939
     between = 0.1037
     overall = 0.1224

4. Is the first model a generalized DiD? I read https://www.statalist.org/forums/for...tion-look-like and the paper mentioned in that post, https://www.annualreviews.org/doi/pd...-040617-013507. On p. 2, the paper says "let D_gt = 1 if unit g is exposed to treatment in period t, and D_gt = 0 if unit g is exposed to the control condition in period t", and the model is Y_gt = a_g + b_t + δD_gt + ε_gt. So the D_gt variable in fact has two layers: 1) the unit is treated, and 2) the time period is after the treatment happens. With my limited understanding, my variable treat_post has a similar meaning to D_gt. So is my first model a generalized DiD?

Thanks a lot for your help.
Andrew Musau

Join Date: Oct 2014

Posts: 9958
#2

30 Mar 2022, 03:58

The treatment variable is an indicator and not a series of indicators, so if you wanted to define it as an interaction, the correct way would be

Code:

xtreg y c.treat#c.post controls i.year if _weight!=. , fe robust

or

Code:

xtreg y 1.treat#c.post controls i.year if _weight!=. , fe robust

However, this is not advisable because you will not be able to use postestimation commands such as margins: Stata will find an interaction term without main effects, and you cannot calculate marginal effects for interaction terms. Therefore, you should create the treatment indicator beforehand. Assuming treat is a 0/1 indicator for treated units:

Code:

gen treat_post=1.treat#c.post
xtreg y i.treat_post controls i.year if _weight!=. , fe robust
margins treat_post, at(year=(1/10))
marginsplot, recast(line) noci

where you replace 1/10 with the relevant time range in your sample. For the R2 statistic, as you have a fixed effects model, perhaps the within R2 is the most relevant. But the LSDV R2 is also a popular choice:

Code:

reg y i.treat_post i.panelvar controls i.year if _weight!=. , cluster(panelvar)
di e(r2)

where "panelvar" is your panel identifier in the xtreg regression.
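
As a rough illustration of where these R2 values are stored after estimation, here is a minimal sketch on simulated data. The variables firm, year, d and y are hypothetical stand-ins, and -areg- is swapped in for the -reg- call above only to avoid printing the firm dummies; its reported R2 includes the absorbed firm effects, so it should match the LSDV R2.

```stata
* Simulated panel: 50 hypothetical firms, 8 years each (all values assumed)
clear
set seed 12345
set obs 50
generate firm  = _n
generate alpha = rnormal()                        // firm fixed effect
expand 8
bysort firm: generate year = 2009 + _n
generate byte d = (firm <= 25) & (year >= 2014)   // hypothetical treatment indicator
generate y = 0.5*d + alpha + 0.1*(year - 2010) + rnormal()
xtset firm year

* Fixed-effects regression; the three R2 values are left in e()
xtreg y i.d i.year, fe vce(cluster firm)
display "within  R2 = " e(r2_w)
display "between R2 = " e(r2_b)
display "overall R2 = " e(r2_o)

* Same slope estimates with firm dummies absorbed; e(r2) is the LSDV R2
areg y i.d i.year, absorb(firm) vce(cluster firm)
display "LSDV R2    = " e(r2)
```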

Last edited by Andrew Musau; 30 Mar 2022, 04:13.
Lucy Garcia

Join Date: Sep 2021

Posts: 47
#3

30 Mar 2022, 06:05

Hi Andrew,

Thanks a lot for your reply!

"The treatment variable is an indicator and not a series of indicators"

Oh yes, I suppose this is an important distinction that I failed to understand before.

I have another question. I want a 3-way interaction that builds on treat_post by adding a dummy, high, which indicates whether a firm's value of a continuous variable is above the industry average in a given year. This high dummy is a series of indicators, because a firm can be in the high category for more than one year during the sample period.

The code to generate this 3-way interaction should be:

Code:

gen treat_post_high=1.treat#c.post#c.high

and in the regression it should be:

Code:

xtreg y i.treat_post_high controls i.year if _weight!=. , fe robust

Does that look right?
Andrew Musau

Join Date: Oct 2014

Posts: 9958
#4

30 Mar 2022, 12:31

No, do not interfere with the treatment indicator. Generate it, then have an indicator for the "high" firms and interact this with the treatment indicator.

Code:

gen treat_post=1.treat#c.post
xtreg y i.treat_post##i.high controls i.year if _weight!=. , fe robust
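
To read off the effects from this specification, one option is -lincom- on the stored coefficients. A minimal sketch on simulated data, assuming treat is a firm-level 0/1 indicator and, for simplicity, a single common switch year (the sample in this thread has staggered switches); all variable values and effect sizes are made up:

```stata
* Simulated panel: 60 hypothetical firms, 8 years each (all values assumed)
clear
set seed 2022
set obs 60
generate firm  = _n
generate alpha = rnormal()                    // firm fixed effect
expand 8
bysort firm: generate year = 2009 + _n
generate byte treat = firm <= 30              // assumed firm-level treated indicator
generate byte post  = year >= 2014            // assumed common switch year
generate byte treat_post = treat * post
generate byte high = runiform() < 0.5         // hypothetical firm-year "high" dummy
generate y = 0.5*treat_post + 0.3*treat_post*high + 0.2*high + alpha + rnormal()
xtset firm year

* ## expands to both main effects plus their interaction
xtreg y i.treat_post##i.high i.year, fe vce(cluster firm)

* Treatment effect when high == 0
lincom 1.treat_post

* Treatment effect when high == 1 (main effect plus interaction)
lincom 1.treat_post + 1.treat_post#1.high
```

The coefficient on 1.treat_post#1.high is then the difference between the two effects.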