This is not exactly a Stata question but rather an econometric question about a Stata implementation. Can we use staggered DID in a dataset that contains no never takers (that is, no individuals who remain untreated throughout the observation window)? I had always thought the answer was yes, but after running into problems in a real-world implementation I tried a simple numerical simulation in Stata, and the results confuse me. Code for replication is below, but let me briefly state what I did for clarity:
- I simulate a dataset with I=100 individuals and T=10 periods and no missing data; every individual is treated exactly once, in a randomly chosen period
- I try to estimate the effect of the treatment on a (purely random) outcome
- First I use static TWFE (following the terminology in Callaway and Sant'Anna 2021, this means regressing the outcome on time effects, individual effects and a treatment dummy). No problem here
- When I try dynamic TWFE (that is, I regress the outcome on time effects, individual effects, treatment lags and treatment leads, leaving the first lag out for identification), Stata has to drop one time effect due to collinearity. In the code, treat_lagK marks K periods before treatment and treat_leadK marks K periods after it. The problem persists whichever command I use (reg or reghdfe) and however the variables are defined or ordered in the command: one regressor must always be dropped, which suggests that the collinearity is a feature of the data
- If I fabricate just one never taker, the problem goes away and dynamic TWFE works just fine
Code:
***Database generation
clear all
set seed 260586
set obs 100
gen i=_n
expand 10
bys i: gen t=_n
gen r=round(uniform()*8) // random treatment assignment
bys i: gen rr=r if _n==1
replace rr=5 if rr==0
bys i: egen t_treat=mean(rr)
drop r rr
gen treat=t==t_treat
gen treat_dif=t-t_treat // lags and leads
forvalues i=1/9 {
    gen treat_lag`i'=treat_dif==-`i'
    gen treat_lead`i'=treat_dif==`i'
}
gen y=rnormal(0,1) // random outcome
drop treat_lag8 treat_lag9 // no cases with this many lags

***Estimation
xtset i t
xtreg y treat i.t, fe // traditional (static) TWFE
xtreg y treat_lag7 treat_lag6 treat_lag5 treat_lag4 treat_lag3 treat_lag2 treat treat_lead* i.t, fe // dynamic TWFE

***Now let's throw in one never taker and try again
unab vars: treat treat_lag1-treat_lead9 // treat is included so that individual 1 is never treated
foreach var in `vars' {
    replace `var'=0 in 1/10
}
xtreg y treat_lag7 treat_lag6 treat_lag5 treat_lag4 treat_lag3 treat_lag2 treat treat_lead* i.t, fe // dynamic TWFE
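A possible diagnostic, offered as a sketch rather than a definitive explanation: when every individual is eventually treated, event time equals calendar time minus treatment time, so a weighted sum of the event-time dummies may be reproducible from the time effects and the individual effects alone. The check below assumes it is run right after the database generation block, before the never taker is fabricated. The names chk_sum and chk_lin are hypothetical helpers, not part of the code above.

```stata
* Diagnostic sketch; run after the database generation block and
* before fabricating the never taker.
* chk_sum and chk_lin are hypothetical helper variables.

* 1. Every observation falls in exactly one event-time cell
egen chk_sum = rowtotal(treat treat_lag* treat_lead*)
assert chk_sum == 1

* 2. The dummies weighted by event time reproduce t - t_treat
gen chk_lin = 0
forvalues k = 1/7 {
    replace chk_lin = chk_lin - `k' * treat_lag`k'
}
forvalues k = 1/9 {
    replace chk_lin = chk_lin + `k' * treat_lead`k'
}
assert chk_lin == t - t_treat

drop chk_sum chk_lin
```

If both assertions pass, the second one is the relevant one: t is a linear combination of the time dummies and t_treat is constant within individual, so the weighted sum of event-time dummies lies in the span of the two sets of fixed effects. That would be consistent with one extra regressor being dropped beyond the omitted first lag.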