Thanks to Kit Baum, my first command, scul, is now available on SSC! Before I go over a little of what it can do, I want to thank Andrew Musau, FernandoRios, Bjarte Aagnes, daniel klein, Damian Clarke, and the many others who provided technical or substantive feedback while I was writing it.
SCUL stands for Synthetic Controls Using LASSO, but the name is something of a misnomer: it is now a very robust command that fits a generalized class of elastic net models for causal inference. Let's look at some examples. We can begin with the classic 2003 example of terrorism in the Basque Country, where we study how terrorist attacks affected GDP per capita in the region. Note that, in addition to the dependencies scul checks for itself, you must have tabstatmat installed to get the most out of scul.
Compared to the original SCM, SCUL gets a (marginally) better pre-intervention fit and, more importantly, does not need covariates to obtain that fit. It also obtains a very similar average treatment effect, and even selects the same donors as the original synthetic control estimator. This means we can do causal inference in settings where we do not adjust for additional predictors.
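For anyone wondering what "synthetic controls using LASSO" looks like mechanically, here is a minimal sketch on simulated data. It uses Stata's built-in lasso command (Stata 16 or later) rather than scul, so it is not scul's implementation, only an illustration of the general idea: fit the treated unit's outcome on the donors' outcomes in the pre-intervention period, predict the counterfactual, and take the gap. The variable names (donor1-donor5, treated, cf, gap) and the simulated effect of -2 are made up for the example.

```stata
* Illustration only: simulated data, built-in lasso, NOT scul's internals.
clear
set seed 12345
set obs 40
gen year = 1954 + _n                     // years 1955 through 1994

* Five hypothetical donor series with a common trend plus noise
forvalues j = 1/5 {
    gen donor`j' = 10 + 0.1*_n + rnormal()
}

* Treated unit: a mix of donors 1 and 2, with an effect of -2 from 1975 on
gen treated = 0.6*donor1 + 0.4*donor2 + rnormal(0, 0.1)
replace treated = treated - 2 if year >= 1975

* Fit the LASSO on pre-intervention years only (penalty chosen by CV)
lasso linear treated donor1-donor5 if year < 1975, rseed(12345)
lassocoef                                // shows which donors were selected

* Counterfactual for every year, then the observed-minus-counterfactual gap
predict cf
gen gap  = treated - cf
gen gap2 = gap^2

summarize gap2 if year < 1975, meanonly
display "pre-intervention RMSE = " sqrt(r(mean))
summarize gap if year >= 1975, meanonly
display "average post-intervention gap = " r(mean)
```

No covariates enter anywhere: the donors' outcome paths are the only predictors, which is the point made above.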
What about settings where we have neither additional predictors nor an obvious comparison unit? The original SCM case study of Prop 99 compared California to 38 states that did not run an anti-tobacco program. However, we could argue (as I do) that this comparison is, in this instance, a little absurd. California is humongous, and would be the 7th largest economy in the world were it a nation of its own. What if, instead of comparing it to states, we compare it to the mainland divisions of the United States, which are comparable to California in population and/or size?
Here the pre-intervention fit is 1.64 and the treatment effect is -21.4. The original analysis had a fit of about 1.75 and, after adjusting for 4 covariate predictors of per-capita smoking rates, a treatment effect of around -19. As above, we obtain a very similar pre-intervention fit and a similar effect, even though I don't use (and don't need to use) additional covariate predictors of smoking rates.
scul also works when we have multiple units treated at different points in time, something which has only recently been addressed in the SCM literature. Note that the treatment must be once treated, always treated. For those of us who live in the United States and face high gas prices, this case study may be amusing. Georgia, Connecticut, and Maryland all passed gas tax holidays recently. To see how these tax holidays affected prices, we estimate the effect for all three states at once.
This generates the average treatment effects on the treated across all treated units, balanced in event time over the month before and the month after each gas tax holiday.
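To make "balanced in event time" concrete, here is the generic definition I have in mind, written out as a sketch. The notation is mine and not taken from the scul help file: $N_1$ is the number of treated units, $T_i$ is unit $i$'s own treatment date, $e$ is the number of days relative to that date, and $\hat{y}^{0}_{i,t}$ is the estimated counterfactual for unit $i$. I am assuming the gaps are averaged with equal weight across treated units at each event time.

```latex
\hat{\tau}_{i,e} = y_{i,\,T_i+e} - \hat{y}^{0}_{i,\,T_i+e},
\qquad
\widehat{\mathrm{ATT}}_{e} = \frac{1}{N_1}\sum_{i=1}^{N_1} \hat{\tau}_{i,e},
\qquad e = -28,\dots,28 .
```

Because every unit is re-centered on its own $T_i$, Maryland (18 March), Georgia (24 March), and Connecticut (2 April) all contribute to the same event-time window even though their calendar dates differ.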
This is a little of what it can do. Please do post here if you notice any bugs or have suggestions, and please ask any questions you might have (I've already noticed a few bugs!). Happy causal inferenc-ing.
Code:
* Basque Country example
* (tabstatmat must also be installed to get the most out of scul)
ssc install scul, replace
webuse set http://fmwww.bc.edu/repec/bocode/s/
webuse scul_basque, clear
cls
scul gdp, ahead(3) ///
    trdate(1975) ///
    trunit(5) ///
    lamb(lopt) ///
    obscol(black) ///
    cfcol(red) ///
    legpos(4)
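Code:
* Prop 99 example, with US divisions as donors
* (assumes webuse is still set to http://fmwww.bc.edu/repec/bocode/s/)
webuse scul_p99_region, clear
cls
scul cigsale, ///
    ahead(1) ///
    trdate(1989) ///
    trunit(3) ///
    lamb(lopt) ///
    obscol(black) ///
    cfcol(blue) ///
    legpos(7) q(1) cv(adaptive)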
Code:
* Gas tax holiday example: multiple treated units, staggered dates
webuse set http://fmwww.bc.edu/repec/bocode/g/
webuse Gas_Holiday, clear
loc int_time = td(24mar2022)
// td(18mar2022) MD
// td(24mar2022) GA
// td(02apr2022) CT
cls
scul regular, ///
    ahead(28) ///
    trdate(`int_time') ///
    trunit(11) ///
    lamb(lopt) ///
    obscol(black) ///
    cfcol(red) ///
    legpos(7) ///
    before(28) after(28) ///
    multi tr(treat) ///
    donadj(et) ///
    intname("Gas Holiday") ///
    rellab(-28(7)28) cv(adaptive)