Hi all,
I am using the user-written command csdid to apply the Callaway and Sant'Anna estimator in a setting with staggered treatment adoption, where units are treated for varying lengths of time and are not observed while treated. At the bottom, I modify the data from FernandoRios' example to resemble my own set-up: a matched-pair design with treated and untreated units, in which some units are missing the treatment period and others are missing both the treatment period and the period after it.
Most importantly, I want every unit's first observed post-treatment period to be treated as t0 (or t1) from an event-study perspective. I could simply run the code below and get a result. However, Tp1 (post-treatment period t1) would then consist of the first post-treatment observation for units that are treated/unobserved for only one period, while Tp2 (post-treatment period t2) would mix the second post-treatment observation for those units with the first post-treatment observation for units that have been treated/unobserved for two periods. Instead, I want Tp0 to be the first observed post-treatment period for everyone.
I think the way to go about this is to estimate the treatment effects separately for each group and then re-aggregate them manually (e.g. run csdid once for units with one missing period and once for units with two missing periods), but I admit that my understanding of how to bring the estimates back together is currently too limited to implement this. Is there a simpler way within the current command structure that I am missing, or does anyone have ideas about how to achieve this?
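To make the target alignment concrete, here is a minimal sketch of the event time I am after. It assumes it is run after the data set-up at the bottom of this post, so that random and pair_treat already exist. The variables gap, first_obs_post, and rel_obs are hypothetical helpers of my own; csdid neither produces nor accepts them, so this only illustrates the alignment and does not solve the estimation problem.

```stata
* gap, first_obs_post and rel_obs are hypothetical helper variables,
* not part of csdid.

* number of unobserved periods per matched pair:
*   random==1 -> only the treatment period is missing
*   random==0 -> the treatment period and the following period are missing
gen byte gap = cond(random == 1, 1, 2)

* first period in which the pair is observed again after treatment starts
gen int first_obs_post = pair_treat + gap

* event time counted from that period, so rel_obs==0 is the first
* observed post-treatment period for every pair
gen int rel_obs = year - first_obs_post

* check which observed periods feed each realigned event time, by gap length
tab rel_obs gap if !missing(lemp2)
```

In this table, rel_obs==0 should be populated for both gap lengths, which is the alignment I would like the event-study output to reflect.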
I hope everything is clear and thanks for any advice,
John
Code:
csdid lemp2 lpop2, ivar(countyreal) time(year) gvar(first_treat) method(dripw)
estat event
Code:
***********UNDERSTANDING CSDID when having treatment gaps of varying lengths
use https://friosavila.github.io/playingwithstata/drdid/mpdta.dta, clear
sort county year
preserve
keep county treat
duplicates drop
// gunique county if treat==0
set seed 666
gen random = runiform()
sort treat random
by treat: gen counter = _n
sum counter if treat==1
keep if counter<=r(max)
keep county counter
tempfile tmp
save `tmp'
restore
merge m:1 countyreal using `tmp', keep(3) nogen
egen pair_treat = max(first_treat), by(counter)
drop if pair_treat==2007
expand 2 if year==2007, gen(test)
replace year=2008 if test==1
drop test
sort counter county year
csdid lemp lpop , ivar(countyreal) time(year) gvar(first_treat) method(dripw)
estat event
csdid_plot, name(base_matched, replace)

**randomly remove data for treatment period t0 or both t0 and t1
set seed 666
gen random = runiform()
egen random2 = mean(random), by(counter)
replace random = random2>=0.5
drop random2
clonevar lemp2 = lemp
clonevar lpop2 = lpop
replace lemp2 = . if random==1 & year==pair_treat
replace lpop2 = . if random==1 & year==pair_treat
replace lemp2 = . if random==0 & (year==pair_treat | year==pair_treat+1)
replace lpop2 = . if random==0 & (year==pair_treat | year==pair_treat+1)