Diff-n-Diff (DD) Panel data with many time periods

Angelo Cozzubo

Join Date: Mar 2015

Posts: 32
#1

29 Sep 2017, 12:09

Dear Statalist friends,

I was reading through some DD materials and got confused about how to code the model for panel data, especially with more than two time periods. I was trying to follow Imbens/Wooldridge, Lecture Notes 10, Summer ’07, page 8 (available at: http://www.nber.org/WNE/lect_10_diffindiffs.pdf).

I tried the following commands:

Code:

*1* From Clyde Schechter - Statalist
// https://www.statalist.org/forums/forum/general-stata-discussion/general/1377739-difference-in-differences-regression
xtset id
xtreg outcome i.treated##i.post_treat i.year, fe

*2* Bernal & Peña - Universidad de los Andes
gen delta_outcome=outcome2-outcome1 // first difference between time periods
reg delta_outcome i.treated

*3* DIFF command
diff outcome, t(treated) p(year)

*4* OLS - reg
reg outcome i.treated i.year i.id

Options (1) and (2) gave me the same results, but option (2) is only applicable with two time periods, since you have to create the first difference. Is it possible to replicate it with more time periods?

Option (3) gave me different standard errors and t-values, although the same coefficient and p-value.

Option (4) was completely different, which is why I think it is wrong. Is there any way to fit a DD panel model with more than two periods using the reg command?

I would be very grateful if you could tell me how to specify the DD panel regression correctly with each command, so that I can better understand what was wrong in each of them.

Thanks a lot!
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

29 Sep 2017, 12:54

Methods 2 and 3 do not generalize to multiple time periods.

Method 4 is simply wrong because it does not include an interaction term and is, therefore, not a DID estimator at all.

Method 1 can be used with more than two time periods; no modifications to the code are necessary. The post_treat variable, instead of being 0/1, is then a polychotomous variable encoding the time period.
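
For anyone who wants to try this, here is a minimal sketch of Method 1 with a polychotomous post_treat on simulated data. Everything about the data is made up for illustration (100 units, years 2001-2006, treatment starting in 2004, effects of 1, 2 and 3 in successive post years); only the -xtreg- line follows the specification from post #1.

```stata
* Simulated panel: all values below are hypothetical
clear
set seed 2017
set obs 100                            // 100 units
gen id = _n
gen byte treated = id > 50             // units 51-100 form the treated group
expand 6                               // six yearly observations per unit
bysort id: gen year = 2000 + _n        // years 2001-2006

* Polychotomous period variable: 0 = any pre-treatment year (2001-2003),
* 1, 2, 3 = first, second and third post-treatment year
gen byte post_treat = max(0, year - 2003)

* Common trend plus a treatment effect equal to 1, 2, 3 in the post years
gen outcome = 0.3*year + treated*post_treat + rnormal(0, 0.5)

* Method 1, unchanged from post #1
xtset id year
xtreg outcome i.treated##i.post_treat i.year, fe

* One interaction coefficient per post-treatment period
display "period 1: " _b[1.treated#1.post_treat]
display "period 2: " _b[1.treated#2.post_treat]
display "period 3: " _b[1.treated#3.post_treat]
```

Stata will report some terms as omitted because of collinearity (the treated main effect is absorbed by the fixed effects, and the post_treat main effects overlap with the year indicators). The interaction coefficients are not affected by that.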
Angelo Cozzubo

Join Date: Mar 2015

Posts: 32
#3

29 Sep 2017, 14:26

Thank you very much, Clyde. Some follow-up questions, just to be completely sure:

1) Regarding method 4, could it be possible to adjust it to become a panel DD with many time periods in this way?

Code:

*4* OLS - reg
reg outcome i.treated##i.year i.id

2) Regarding your response about method 1 and the post_treat variable being polychotomous instead of 0/1:

Would it not be desirable to have just a dummy variable equal to one for all post-treatment periods and zero for all pre-treatment periods?

If I ran it with a polychotomous variable indicating the periods, for example as 0, 1, 2, 3, etc., how could I interpret a unique DD effect? And how could I take into consideration more than one year of pre-treatment?

3) Strictly following the Imbens and Wooldridge lecture notes, and with the setup in the uploaded image, where w_it is unity if unit i participates in the program at time t: could we use the following regression command, since there is no need for the treatment dummy (perhaps dropped because of collinearity with the FE)?

Code:

xtset id
xtreg outcome i.treated#i.post_treat i.year, fe

Thanks again!
Last edited by Angelo Cozzubo; 29 Sep 2017, 14:34.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#4

29 Sep 2017, 19:20

"1) Regarding method 4, could it be possible to adjust it to become a panel DD with many time periods in this way?"

Code:
*4* OLS - reg
reg outcome i.treated##i.year i.id

Yes, you can do this, but in this case it's no different from Method 1 in post #1 of this thread. For a linear regression, whether you represent the fixed effects explicitly as i.id or implicitly by using -xtset- and -xtreg- you get the same thing. The advantage of using -xtreg- is that your output won't be cluttered up with estimates of the numerous i.id coefficients which are, typically, of no interest in any case.
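
A quick way to convince yourself of this is to run both forms on the same data and compare a coefficient. The sketch below does that on a small simulated panel; the data, the 2003 treatment start and the scalar names b_reg and b_fe are all hypothetical.

```stata
* Simulated panel: all values below are hypothetical
clear
set seed 4
set obs 40                             // 40 units
gen id = _n
gen byte treated = id > 20
expand 4                               // four years per unit
bysort id: gen year = 2000 + _n        // years 2001-2004
gen outcome = id/10 + 0.3*year + 2*treated*(year >= 2003) + rnormal()

* Explicit fixed effects: one indicator per unit
regress outcome i.treated##i.year i.id
scalar b_reg = _b[1.treated#2004.year]

* Implicit fixed effects through the within estimator
xtset id year
xtreg outcome i.treated##i.year, fe
scalar b_fe = _b[1.treated#2004.year]

display "regress: " b_reg "   xtreg, fe: " b_fe
```

The two displayed numbers should agree up to rounding; the -regress- output is simply longer because it lists every i.id coefficient.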

"Would it not be desirable to have just a dummy variable equal to one for all post-treatment periods and zero for all pre-treatment periods?"

Yes, you can do that, but then your model does not distinguish the different periods from each other. You are, in effect, reducing it to a simple 2-period model with pre vs post only.
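
To make the pooling concrete, here is a sketch on simulated data that fits both versions. All names and numbers are hypothetical: period is a 0/1/2/3 variable, post_treat is the single pre/post dummy, and the simulated effects are 1, 2 and 3 in the three post years. In this balanced simulated panel the pooled coefficient works out to the average of the three period-specific coefficients.

```stata
* Simulated panel: all values below are hypothetical
clear
set seed 101
set obs 100
gen id = _n
gen byte treated = id > 50
expand 6
bysort id: gen year = 2000 + _n            // years 2001-2006
gen byte period = max(0, year - 2003)      // 0 = pre, 1/2/3 = post years
gen byte post_treat = year >= 2004         // single pre/post dummy
gen outcome = 0.3*year + treated*period + rnormal(0, 0.5)
xtset id year

* Period-by-period model: one DID coefficient per post year
xtreg outcome i.treated##i.period i.year, fe
scalar b1 = _b[1.treated#1.period]
scalar b2 = _b[1.treated#2.period]
scalar b3 = _b[1.treated#3.period]
scalar avg_by_period = (b1 + b2 + b3) / 3

* Pooled model: one DID coefficient for the whole post period
xtreg outcome i.treated##i.post_treat i.year, fe
scalar b_pooled = _b[1.treated#1.post_treat]

display "pooled: " b_pooled "   average of period effects: " avg_by_period
```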

"If I ran it with a polychotomous variable indicating the periods, for example as 0, 1, 2, 3, etc., how could I interpret a unique DD effect?"

You can't and you shouldn't. If there are multiple treatments, each should get its own estimated effect.

"And how could I take into consideration more than one year of pre-treatment?"

Just include those observations in the estimation sample. Since these years are all pre-treatment there is no treatment effect operating in any case so no need to distinguish them.

"3) Strictly following the Imbens and Wooldridge lecture notes, and with the setup in the uploaded image, where w_it is unity if unit i participates in the program at time t: could we use the following regression command, since there is no need for the treatment dummy (perhaps dropped because of collinearity with the FE)?"

This is a perfectly legitimate analysis. In the case of a simple treatment vs control, pre- vs post-treatment analysis, it is actually just an algebraic transform of Method 1 from post #1 and will give the same results for the DID estimates of interest. The models differ only in how they represent time. In Method 1 from post #1, there is an explicit pre-post variable, which causes an extra year indicator to be dropped. In what you show from Imbens and Wooldridge, the pre-post variable is not used, and there is another year indicator variable. Algebraically, these models are equivalent: it's just a re-parameterization of time in the model.
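
The equivalence in the simple two-group, pre/post case can be checked numerically. In the sketch below the data are simulated and every name and value is hypothetical. Note one simplification of my own: w_it is built by hand as treated*post_treat rather than through the # operator, so that the second model has exactly one treatment coefficient to read off.

```stata
* Simulated panel: all values below are hypothetical
clear
set seed 10
set obs 100
gen id = _n
gen byte treated = id > 50
expand 6
bysort id: gen year = 2000 + _n        // years 2001-2006
gen byte post_treat = year >= 2004     // treatment starts in 2004
gen byte w = treated*post_treat        // w_it: 1 if unit i is treated at time t
gen outcome = 0.3*year + 2*w + rnormal(0, 0.5)
xtset id year

* Method 1 from post #1
xtreg outcome i.treated##i.post_treat i.year, fe
scalar b_m1 = _b[1.treated#1.post_treat]

* Imbens and Wooldridge style: unit effects, year effects and w_it only
xtreg outcome w i.year, fe
scalar b_iw = _b[w]

display "Method 1: " b_m1 "   w_it model: " b_iw
```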

Now, if we get beyond the simple treatment vs control and pre- vs post-treatment analysis, the two models are different, and they would be used for different purposes.

Method 1 from Post #1 (and using a polychotomous pre-post variable) would be used when the first treatment is applied in the first post-period, and then a new treatment is added on (or replaces the first treatment) in the second post-period, and then a third treatment is added on (or replaces the second) in the third post-period, etc. By contrast, the model from Imbens and Wooldridge would be used when treatment is in effect intermittently: there is only a single treatment, but it is no longer assumed to remain in effect indefinitely once started--it is applied, and then removed, and then applied, and then removed, in successive time periods.

Added: The Imbens and Wooldridge version could also be used in the scenario of multiple sequential treatments, but because of the way it is parameterized, it would be complicated to extract the DID estimate for each of those treatments. Using Method 1 from post #1, it is easy to get the DID estimate for each treatment out of the -margins- command.
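
As a rough illustration of pulling out one estimate per treatment period, the sketch below uses simulated data in which every name and value is hypothetical. It swaps in -lincom- and -test- for -margins-: they work directly on the interaction coefficients from Method 1, which are already the period-specific DID estimates relative to the pooled pre-treatment years.

```stata
* Simulated panel: all values below are hypothetical
clear
set seed 1909
set obs 100
gen id = _n
gen byte treated = id > 50
expand 6
bysort id: gen year = 2000 + _n             // years 2001-2006
gen byte post_treat = max(0, year - 2003)   // 0 = pre, 1/2/3 = post periods
gen outcome = 0.3*year + treated*post_treat + rnormal(0, 0.5)
xtset id year

xtreg outcome i.treated##i.post_treat i.year, fe

* DID estimate, standard error and confidence interval for each post period
lincom 1.treated#1.post_treat
lincom 1.treated#2.post_treat
lincom 1.treated#3.post_treat

* Change in the effect between the first and the third post period
lincom 1.treated#3.post_treat - 1.treated#1.post_treat
scalar diff_3_1 = r(estimate)

* Joint test that the three period effects are equal
test (1.treated#1.post_treat = 1.treated#2.post_treat) (1.treated#2.post_treat = 1.treated#3.post_treat)
```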

Last edited by Clyde Schechter; 29 Sep 2017, 19:27.
Angelo Cozzubo

Join Date: Mar 2015

Posts: 32
#5

02 Oct 2017, 10:31

Thank you very much, Clyde!
Felipe Salce

Join Date: Nov 2018

Posts: 5
#6

11 Sep 2019, 20:31

Dear Statalist,

I have a question about what happens with a model fit by the following command:

Code:

reg outcome i.treated##i.year i.id controls

How should the effect of the policy be interpreted, given that the model produces an estimate for each year, both before and after the implementation of the policy?

Attached is a simple example of this command, where the policy was applied in 1995 in some states.

2) If you want a unique DID effect, you need this model:

Code:

reg outcome i.treated##i.post_treat i.id controls

Last edited by Felipe Salce; 11 Sep 2019, 20:33.
lilis husna

Join Date: May 2022

Posts: 11
#7

10 May 2022, 14:01

Dear Statalist,
I want to run a DiD model based on Bas & Strauss-Kahn (2015).

Ordinary is coded 0 for the control group and 1 for the treatment group. T is the tariff variable and differs across years (2000, 2001, 2002, 2003, 2004, 2005, 2006). Size_i,t0*α_t corresponds to initial firm size trends, where the initial size of firm i is defined by the number of imported varieties. α_ik, α_ct, α_st and α_pt are firm-product, destination country-year, HS4-year and province-year fixed effects, and η_ipkct is an i.i.d. component.

My questions:
1. How do I run this model in Stata? (I am confused because there are so many fixed effects, and the tariff variable is not a pre/post-treatment indicator, unlike the usual DiD model that codes 0 for pre-treatment and 1 for post-treatment.)
2. How do I check the parallel trends assumption?

I would be very grateful if you could answer my questions.

Thanks a lot!

Last edited by lilis husna; 10 May 2022, 14:10.
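
On checking pre-treatment trends: the sketch below does not reproduce the Bas & Strauss-Kahn specification. It only illustrates one common check in the simpler binary-treatment setting discussed earlier in this thread, using year-by-year interactions with the last pre-treatment year as the base and a joint test of the pre-treatment interactions. The data are simulated, and the 2004 start year, the clustered standard errors and all names are assumptions made for the example.

```stata
* Simulated panel: all values below are hypothetical
clear
set seed 2022
set obs 100
gen id = _n
gen byte treated = id > 50
expand 6
bysort id: gen year = 2000 + _n        // years 2001-2006
gen outcome = 0.3*year + 2*treated*(year >= 2004) + rnormal(0, 0.5)
xtset id year

* Year-by-year interactions, last pre-treatment year (2003) as the base
xtreg outcome i.treated##ib2003.year, fe vce(cluster id)

* Joint test that the pre-treatment interactions (2001, 2002) are zero
test 1.treated#2001.year 1.treated#2002.year
scalar p_pretrend = r(p)
display "p-value, pre-treatment interactions: " p_pretrend
```

Each interaction coefficient is the treated-minus-control gap in that year relative to the gap in 2003. A small p-value on the joint test would suggest the groups were already diverging before treatment; a large one is consistent with parallel pre-trends but does not prove the assumption holds.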