Dear all,
My question is related to those here and here, but I believe that neither of them fully addresses my point.
I am trying to identify and estimate the average treatment effect on the treated (ATT) in a panel data context. My setting is as follows: I have a panel of about 1600 individuals that I observe in four years (2000, 2001, 2002 and 2003). Different individuals get treated in different years, and treatment takes place at the individual level. In 2000 nobody is treated. About 450 individuals get treated in 2001, about 300 in 2002 and about 100 in 2003. The remaining individuals never get treated. Once an individual gets treated, he/she stays in the treatment group for all subsequent years.
My goal is to estimate the ATT for each group of treated units. In particular, I do not want to assume that the treatment effect is constant over time. Hence, I need to estimate the following six parameters: (1) the ATT in 2001, 2002 and 2003 for those first treated in 2001, (2) the ATT in 2002 and 2003 for those first treated in 2002 and (3) the ATT in 2003 for those first treated in 2003.
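To be explicit about the six parameters, here is how I would write them down. This is only my own notation, not taken from any reference: G_i stands for First_treatment, y_{it}(g) is the potential outcome in year t if first treated in year g, and y_{it}(0) is the potential outcome if never treated.

```latex
\mathrm{ATT}(g, t) = E\left[\, y_{it}(g) - y_{it}(0) \mid G_i = g \,\right],
\qquad
(g, t) \in \{(2001, 2001),\ (2001, 2002),\ (2001, 2003),\ (2002, 2002),\ (2002, 2003),\ (2003, 2003)\}
```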
For this purpose, I have a variable Time = {2000, 2001, 2002, 2003}; a variable Treatment, equal to 1 in all years for treated units and 0 in all years for untreated units; and a variable First_treatment = {0, 2001, 2002, 2003}, depending on whether and when an individual moves. First_treatment = 0 for the non-treated, First_treatment = 2001 for those who move in 2001, and so on. y is my outcome of interest.
In order to identify all the parameters of interest, I am now doing the following:
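Code:
foreach t in 2001 2002 2003 {
    // reload the full data set on each pass
    use ".../mydata.dta", clear
    xtset ID Time
    // keep one treated cohort plus the never-treated individuals
    keep if First_treatment == `t' | First_treatment == 0
    // fixed-effects regression with year-by-treatment interactions
    eststo: qui xtreg y i.Time##Treatment, fe robust
}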
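In equation form, I believe each pass of the loop estimates the model below. The notation is my own and may not match what Stata does internally: \alpha_i are the individual effects, \lambda_t the year effects, D_i is Treatment, and 2000 is the omitted base year. The main effect of D_i does not appear because it is constant within individual and so absorbed by \alpha_i. If the panel is balanced, my understanding is that the interaction coefficient for 2001 reduces to the difference-in-differences of group means relative to 2000, where \bar{y}_{d,t} denotes the mean of y for group D = d in year t.

```latex
\begin{align}
y_{it} &= \alpha_i + \lambda_t
          + \sum_{s \in \{2001, 2002, 2003\}} \beta_s \, D_i \cdot \mathbf{1}[t = s]
          + \varepsilon_{it} \\
\hat{\beta}_{2001} &= \left( \bar{y}_{1,2001} - \bar{y}_{1,2000} \right)
                    - \left( \bar{y}_{0,2001} - \bar{y}_{0,2000} \right)
\end{align}
```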
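This is the output that Stata gives me for the cohort of people treated in 2001 (the outputs for the other two cohorts are similar):
My questions are: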
1) How should I interpret the coefficients on Time#Treatment? Consider 2001.1 as an example. Does it mean that, after controlling for time and individual effects, the difference in y between treated and non-treated individuals in 2001 is about .003? Note that the dummy for 2000 is dropped. Does this affect how I should interpret the coefficients?
2) Is there a way to estimate my six parameters of interest without splitting the data? For example, can I code a variable indicating "period since treatment started"? If yes, how?
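To make question 2 concrete, this is roughly what I have in mind. It is only a sketch on simulated toy data (not my real data), and the names rel_time and att_* are ones I made up for illustration. I am not sure it identifies the same parameters as the split-sample regressions, since here the not-yet-treated individuals also act as controls.

```stata
* Toy panel mimicking my design (simulated, NOT my real data)
clear
set seed 12345
set obs 1600
gen ID = _n
gen First_treatment = cond(ID <= 450, 2001, cond(ID <= 750, 2002, cond(ID <= 850, 2003, 0)))
expand 4
bysort ID: gen Time = 1999 + _n
gen y = rnormal()
xtset ID Time

* rel_time (hypothetical name): years since treatment started,
* missing for the never-treated
gen rel_time = Time - First_treatment if First_treatment > 0

* att_g_t (hypothetical names): one dummy per cohort g and
* post-treatment year t, six in total
foreach g in 2001 2002 2003 {
    forvalues t = `g'/2003 {
        gen byte att_`g'_`t' = (First_treatment == `g') & (Time == `t')
    }
}

* single pooled regression on the full sample
xtreg y i.Time att_*, fe robust
```

Each att_g_t coefficient would then be my candidate estimate for cohort g in year t. Pooling the cohorts by rel_time instead would force the effect to be the same across cohorts at a given number of years since treatment, which is why I kept separate cohort-by-year dummies.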
Thanks a lot in advance for any answer.