Hi,
First of all, a big 'thank you' to the authors of the sdid command; it really is easy to use.
I have a query about the event-study plot that is described in the Stata paper introducing this new command by Clarke et al. (2023).
My problem is that in my application the estimated 'overall' ATT (the one in the printed output table) is 3 times higher than the values that I obtain when plotting the ATT by year using the event-study plot mentioned above.
What I have noticed is that, while using controls makes a big difference to the estimated 'overall' ATT, it makes little difference to the event-study plot.
To make this point I will use the same application used in the Stata paper.
Let's start with the scenario that does not control for covariates:
In this case the estimated 'overall' ATT (6.85377) is essentially identical to the average of the ATTs by year (6.8537698).
However, in the scenario with controls:
The estimated 'overall' ATT (7.11653) is bigger than the average of the ATTs by year (6.8687816), which is almost at the same level as in the scenario without covariates (6.8537698).
In my application this issue is exacerbated, i.e. the average of the ATTs by year is a lot lower than the estimated 'overall' ATT.
Am I misunderstanding anything?
Should we actually expect the estimated 'overall' ATT to look similar to the average of the ATTs by year?
Perhaps Damian Clarke can help with this?
Many thanks,
Lukas
Code:
// Scenario 1: no covariates
webuse set www.damianclarke.net/stata/
webuse quota_example.dta, clear
egen m=min(year) if quota==1, by(country) //indicator for the year of adoption
egen mm=mean(m), by(country)
keep if mm==2002 | mm==. //keep only one time of adoption
drop if lngdp==.
sdid womparl country year quota, vce(noinference) graph g2_opt(ylab(-5(5)20) ///
    ytitle("Women in Parliament") scheme(sj))
matrix lambda = e(lambda)[1..12,1] //save lambda weight
matrix yco = e(series)[1..12,2] //control baseline
matrix ytr = e(series)[1..12,3] //treated baseline
matrix aux = lambda'*(ytr - yco) //calculate the pre-treatment mean
scalar meanpre_o = aux[1,1]
matrix difference = e(difference)[1..26,1..2] // Store Ytr-Yco
svmat difference
ren (difference1 difference2) (time d)
replace d = d - meanpre_o // Calculate vector in (8)
gen y=time>=2002 & time!=.
bys y: egen d_mean=mean(d)
sort time d
preserve
keep if time==2002
di d_mean
restore
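The comparison between the 'overall' ATT and the average of the ATTs by year can also be shown side by side. This is only a sketch: it assumes that sdid leaves the overall estimate in e(ATT), and that it is run immediately after the block above, so that the stored results and the variables time and d are still in memory. The scalar names att_overall and att_by_year_mean are introduced here purely for illustration.

```stata
// Overall ATT as reported in the sdid output table (assumed to be stored in e(ATT))
scalar att_overall = e(ATT)

// Average of the per-year ATTs over the post-treatment years (2002 onwards)
summarize d if time >= 2002 & !missing(time), meanonly
scalar att_by_year_mean = r(mean)

// Show both numbers and their gap
display "Overall ATT:             " %9.5f att_overall
display "Mean of ATTs by year:    " %9.5f att_by_year_mean
display "Difference:              " %9.5f att_overall - att_by_year_mean
```

The same three steps can be run after the block with covariates to see how large the gap is there.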
Code:
// Scenario 2: with covariates (projected)
webuse set www.damianclarke.net/stata/
webuse quota_example.dta, clear
egen m=min(year) if quota==1, by(country) //indicator for the year of adoption
egen mm=mean(m), by(country)
keep if mm==2002 | mm==. //keep only one time of adoption
drop if lngdp==.
sdid womparl country year quota, vce(noinference) graph g2_opt(ylab(-5(5)20) ///
    ytitle("Women in Parliament") scheme(sj)) ///
    covariates(lngdp lnmmrt, projected)
matrix lambda = e(lambda)[1..12,1] //save lambda weight
matrix yco = e(series)[1..12,2] //control baseline
matrix ytr = e(series)[1..12,3] //treated baseline
matrix aux = lambda'*(ytr - yco) //calculate the pre-treatment mean
scalar meanpre_o = aux[1,1]
matrix difference = e(difference)[1..26,1..2] // Store Ytr-Yco
svmat difference
ren (difference1 difference2) (time d)
replace d = d - meanpre_o // Calculate vector in (8)
gen y=time>=2002 & time!=.
bys y: egen d_mean=mean(d)
sort time d
preserve
keep if time==2002
di d_mean
restore