Hey everyone. Happy to finally be sharing software for Stata once more! I've mostly been using Python these days. Anyway, I've developed the Forward DID command (fdid) for Stata. The help file, the ado file, and two sample datasets are available at my website. Since it's still under development, I won't be sending it to SSC just yet, so you'll need to place the files in a directory on your ado-path manually, unless there's a way to do this that I'm unaware of. At present, it should work in Stata 16 and later, as it uses frames. There are no special libraries or additional commands the user needs, and it is written entirely in Stata's ado language.
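For anyone unsure about the manual install, here is a minimal sketch using only built-in Stata commands. The folder name is hypothetical; use wherever you saved the downloaded files. Copying the files into the PERSONAL directory reported by sysdir should work just as well.

```stata
* Sketch only: the folder below is a hypothetical download location

* List Stata's ado directories; PERSONAL is the usual home for user files
sysdir

* Put the (hypothetical) folder holding the ado and help files at the front of the search path
adopath ++ "C:/ado/fdid"

* Check that Stata can now find the command, then open its help file
which fdid
help fdid
```

Note that adopath changes last only for the current session, so the line would need to go in a profile.do (or the files into PERSONAL) for a permanent setup.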
Forward DID comes in handy when we wish to estimate the average treatment effect on the treated (ATT) for one or more units, but we don't know which control units are the most relevant. It uses a variant of the forward selection algorithm (Daniel Klein was most helpful in giving suggestions for the underlying code) to select the optimal control group for a treated unit, based on the pre-intervention outcome data for our control units. After we select the control group, we estimate the ATT and its 95% CI following the method described in the original paper. At present it is only automated for one treated unit; however, if you know enough about the developments in DID, you can likely extend this to multiple treated units with a little dynamic adjustment for the control group.
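To give a feel for the selection step, here is a rough sketch of forward selection on the pre-period DID fit. It is not the fdid source, and it skips the confidence intervals. It assumes wide data with one row per period, and the names y0, c1-c5 and post are all hypothetical. At each step it adds the control that gives the highest pre-period R-squared when the counterfactual is an intercept plus the simple average of the chosen controls, then keeps the best-fitting set seen along the way.

```stata
* Sketch only, not the fdid source. Assumed (hypothetical) variables:
*   y0      outcome of the treated unit
*   c1-c5   outcomes of the candidate control units
*   post    0 in pre-treatment periods, 1 afterwards
local pool c1 c2 c3 c4 c5
local chosen
local best_set
local best_r2 = .

* Total sum of squares of the treated outcome over the pre-period
quietly summarize y0 if post == 0
local tss = r(Var) * (r(N) - 1)

while "`pool'" != "" {
    local step_r2 = .
    local step_pick
    foreach d of local pool {
        tempvar avg gap
        quietly egen double `avg' = rowmean(`chosen' `d')
        quietly generate double `gap' = y0 - `avg'
        * DID fit: y0 = intercept + control average, so the residual sum
        * of squares is the demeaned pre-period gap, (N-1)*Var(gap)
        quietly summarize `gap' if post == 0
        local r2 = 1 - r(Var) * (r(N) - 1) / `tss'
        if missing(`step_r2') | `r2' > `step_r2' {
            local step_r2 = `r2'
            local step_pick `d'
        }
        drop `avg' `gap'
    }
    * Keep the best addition at this step and remove it from the pool
    local chosen `chosen' `step_pick'
    local pool : list pool - step_pick
    * Remember the best-fitting set across all steps
    if missing(`best_r2') | `step_r2' > `best_r2' {
        local best_r2 = `step_r2'
        local best_set `chosen'
    }
}

* Counterfactual = control average shifted by the mean pre-period gap
egen double donor_avg = rowmean(`best_set')
generate double gap = y0 - donor_avg
quietly summarize gap if post == 0
generate double cf = donor_avg + r(mean)
generate double te = y0 - cf

display "Selected controls: `best_set'"
* The mean of te over the post-period is the ATT estimate
summarize te if post == 1
```

The equal weighting is what distinguishes this from synthetic control: the only thing being chosen is which units enter the average, not how much weight each one gets.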
As usual, feedback and comments are most appreciated. For an example of how it works, we can run:
Code:
u "agbasque.dta", clear
qui fdid gdpcap, tr(treat) gr1opts(scheme(sj) name(ag, replace))
cwf cfframe
This returns the following frame (the cfframe):
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double(year gdpcap5) float(cf te)
1955 3.853184630005267 3.75793 .0952546
1956 3.9456582961508766 3.90803 .03762826
1957 4.033561734872626 4.0553446 -.021782847
1958 4.023421896896646 4.097583 -.07416092
1959 4.013781968405232 4.1396422 -.12586027
1960 4.285918396222732 4.401853 -.11593468
1961 4.574336095797406 4.677667 -.10333104
1962 4.898957353563045 4.938842 -.03988494
1963 5.197014981629133 5.187985 .009029562
1964 5.3389029787527225 5.259322 .07958081
1965 5.465153005251848 5.324697 .14045647
1966 5.545915627064143 5.448125 .09779026
1967 5.614895726639487 5.563021 .05187454
1968 5.8521849330715785 5.79924 .0529453
1969 6.0814054173695915 6.0361 .0453055
1970 6.17009424134957 6.171775 -.0016810996
1971 6.283633404546246 6.315913 -.032279797
1972 6.5555553986528405 6.6104 -.0548448
1973 6.810768561103078 6.90096 -.09019189
1974 7.105184302810804 7.055095 .05008958
1975 7.377891682175629 7.20316 .1747319
1976 7.232933621922754 7.27621 -.04327621
1977 7.089831372119127 7.344905 -.25507352
1978 6.786703607144611 7.312414 -.5257106
1979 6.6398173868571035 7.322126 -.6823086
1980 6.562839171369564 7.367006 -.8041667
1981 6.50078545499277 7.436914 -.9361285
1982 6.545058606999563 7.550632 -1.0055734
1983 6.595329801139407 7.669598 -1.0742679
1984 6.761496750091492 7.768819 -1.0073225
1985 6.937160671727721 7.872968 -.9358075
1986 7.332191151300521 8.342334 -1.0101427
1987 7.742788123594152 8.811522 -1.0687335
1988 8.12053664075889 9.270319 -1.1497823
1989 8.509711162324157 9.724476 -1.2147647
1990 8.776777889074104 9.961907 -1.1851295
1991 9.02527866619582 10.199697 -1.1744179
1992 8.873892824706335 9.992613 -1.11872
1993 8.718223539089278 9.781245 -1.0630217
1994 9.018137849286365 10.13043 -1.1122934
1995 9.440873861653367 10.433558 -.9926846
1996 9.68651813767495 10.676703 -.9901853
1997 10.170665872808662 11.12229 -.9516248
end
format %ty year
Here we have the observed values for the Basque Country alongside the counterfactual, that is, the outcome had terrorism not occurred. The counterfactual is an equally weighted (uniform convex) combination of the regions Cataluna and Aragon, replicating the original findings from the first paper describing the synthetic control method. Please do let me know how you like it (if you do!).
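If you'd like to summarize the returned frame yourself, something like the following should do it. This is a sketch that assumes the current frame holds year, gdpcap5, cf and te as in the dataex listing. Taking 1975 as the first treated year is my reading of that listing (the te values average to roughly zero through 1974), so adjust it if your treat variable says otherwise.

```stata
* Sketch only: assumes the working frame holds year, gdpcap5, cf and te
* Assumed first treated year; change it to match the treat variable
local t0 = 1975

* The pre-period gap should average to about zero if the fit is good
summarize te if year < `t0'

* The mean post-period gap is the ATT estimate
summarize te if year >= `t0'

* Observed versus counterfactual, with a line at the assumed treatment date
twoway (line gdpcap5 year) (line cf year), xline(`t0') legend(order(1 "Observed" 2 "Counterfactual"))
```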