Hey everyone. Suppose we're interested in implementing an algorithm that selects the "optimal" set of controls for a given treated unit in a policy analysis. Take the dataset below (by the way, I'm working with Stata 17).
In our case, hongkong is our treated unit and the other columns are our control group. The way the first step of the algorithm proceeds is as follows:
We begin by looping over our control units, estimating a DID regression (outcome only) of the treated unit on each control in turn, using only the pre-intervention period (time < 45 in the code). For each regression we compute the predicted counterfactual over that pre-intervention period and the R-squared statistic, and we save the r2 to a separate frame (rsquare). The first unit to be selected is the one which maximizes the R-squared statistic, which we can look up in the rsquare frame. We then move this unit to the front of our control group and proceed.
Now, using the new control group (which for now consists of just canada), we average canada with each of the remaining control units, one at a time, estimate difference-in-differences using each of these averages, and then find the unit which maximizes the R-squared statistic. In this case, it's New Zealand.
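In symbols, this is how I read the criterion from my own code (a sketch; the notation is mine and only for illustration). Here y_0 is the treated unit, U is the set of already-selected controls, i is a candidate, and the slope on the average is constrained to 1 over the pre-intervention period:

```latex
\bar{y}_{U \cup \{i\},t} = \frac{1}{|U| + 1}\Big(\sum_{j \in U} y_{jt} + y_{it}\Big),
\qquad
y_{0t} = \alpha + \bar{y}_{U \cup \{i\},t} + \varepsilon_t, \quad t < 45,

R^2_i = \operatorname{corr}\big(y_{0t}, \hat{y}_{0t}\big)^2,
\qquad
i^{*} = \arg\max_{i \notin U} R^2_i .
```

The selected unit i* is then added to U and the step is repeated with the controls that are left.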
Here's where my question lies: now we must repeat step 2 for the remaining 3 controls (australia austria denmark). That is, we must see which of these three, when added to canada and new zealand, maximizes the r2, and add it to the macro/set U. Then we must see which of the remaining 2 maximizes the r2 statistic, and finally we simply add the last control unit, so that all units are included in the set U. How might I do this? Or, at the very least, what might a good starting point be? My initial reaction is to use a while loop which keeps looping until some condition is fulfilled. Maybe, at the end of the loop for step 2, I should check how many words there are in `newdonors'. Once there are 0 words in `newdonors', there are no more control units left and the while loop can conclude. Is that a reasonable starting point?
Code:
clear *
* Example generated by -dataex-. For more info, type help dataex
clear
input float(hongkong australia newzealand austria canada denmark)
 .062   .04048913   .04724391   -.01308351   .01006395   -.01229182
 .059   .03785692   .03875869  -.007580798   .02126387  -.003092842
 .058   .02250948   .08991753   .000542671  .018919427  -.007764421
 .062   .02874655   .06975085   .001180751   .02531683  -.004048589
 .079   .03399039   .06019911    .02551085   .04356715     .0310944
 .068   .03791937   .06255518   .019941313   .05022538       .06428
 .046   .05228941   .04292477   .017087875   .06512183    .04595546
 .052  .031070895   .04760897   .023035197   .06733068    .05516641
 .037  .008696091  .022149924   .025292696    .0509212    .04805718
 .029  .006773674  .023302693   .021849955   .03152506   .011953605
 .012   .00302829   .03045487   .018319173  .018179957    .02080968
 .015  .010981606   .03069211    .01345693  .015165864   .008303516
 .025   .03818205   .04089288   .015387368  .007820651   .010101924
 .036   .03452006   .03584749   .017335817  .011510062    .03085883
 .047   .03667319   .03766511   .013595447   .02166007    .04011321
 .059   .03898745  .025593406   .004195063   .02470353    .02569367
 .058  .025748033  .005592703  -.001534697  .035775363   .029461836
 .072   .05186257  .034571752  -.002021203   .03868772     .0398559
 .061   .05928891   .03353216     .0161481   .03672911    .01764238
 .014   .06342068  .018211482   .017984418    .0380428    .03139775
-.032   .06243018  .018747104    .03047369  .033733726   .033021096
-.061   .03949186 -.017352613   .032317627  .028866187  -.010453694
-.081   .04318146 -.019380176    .02719671    .0189473    .02242801
-.065  .035846256   .02692859    .02283163   .02248398    .01477213
-.029       .0409   .03889057    .02242066   .03785257   .003396192
 .005   .03857499   .05541473   .028941363   .04818999   .029012986
 .039   .02692009   .06664848    .03839999  .063958384   .010058184
 .083   .03297373   .04101127    .03883952   .06480881   .026324544
 .107   .03959653   .06054202   .036271237   .06725128    .03857407
 .075   .04687658  .034375284   .030074896   .07295554   .030016804
 .076  .024391215    .0207593   .016178379  .065640524    .03717389
 .063  .005838784   .03254855   .011539886   .05321779   .036737546
 .027  .000732203  .015890202   .005441154   .04056079    .01424755
 .015 -.002025588   .05496934  -.005665725  .007522337   .007846387
-.001  .028212607  .036681067  -.004335946 -.017121905    .01250594
-.017   .03998247   .04987184  -.000616657 -.014710952   -.00049339
 -.01   .03867529   .05044868   .003416942 -.011813972   -.00188065
 .005   .04083484  .011498967   .010456725   .01313668    .01491368
 .028  .032215476   .04828115    .01278433    .0296833   .002654192
 .048   .03258446  .015707554   .010514015   .03784149  -.001912172
 .041   .02694338   .02053765   .007017912   .03301423   .005125017
-.009   .02464757    .0400604   .008199689  .015919374   -.01868332
 .038   .03993078   .04220926   .005746135  .025793217  -.005911494
 .047   .05529072   .05431711   .008809593   .02063149   .017165432
 .077   .05989317  .066612795   .013646583   .02710546    .02884403
  .12   .05748491   .06818556   .017229607   .04896003    .03726172
 .066   .04655641   .05111898    .02444402   .05043372   .036128163
 .079  .030096613   .04093498    .02429197   .04781632    .03432884
 .062   .03160237   .01501046    .02370696   .03985261   .020981267
 .071   .04588263   .02387351     .0257305   .03162305    .05209525
 .081   .04553443  .011341385   .026474627   .03574734    .04374102
 .069   .05498263  .008914476   .032616325    .0503335    .02875244
  .09   .04806678  .018647296    .03832044   .04947587    .04931609
 .062   .02698179 -.009260945   .035103742   .04119911    .03880127
 .064  .032730877  .011500795    .03722008  .031677015    .04183601
 .066   .03857545  .036755715    .03898238    .0200051    .02980916
 .055    .0580129   .03946248   .036197655   .03071206   .033133514
 .062   .05951871   .05829326    .03257025   .03982709  -.007168933
 .068   .05664859   .05114701    .03155845   .03474158   .013516975
 .069   .04582468   .04590492    .01909501   .03812844    .02379412
 .073  .027523303  .031214973   .017430725   .02921731  -.005199719
end

cls
* Creates the r-squared frame indexed to each individual unit
mkf rsquare
* We only need one row
frame rsquare: set obs 1
* Our time variable
g time = _n, b(hongkong)
* Gets the list of variable names
qui ds
** Our time column
loc temp: word 1 of `r(varlist)'
loc time: disp "`temp'"
** Our treated unit
loc t: word 2 of `r(varlist)'
loc treated_unit: disp "`t'"
loc a: word 3 of `r(varlist)'
* Our first control unit
loc donor_one: disp "`a'"
local nwords : word count `r(varlist)'
loc b: word `nwords' of `r(varlist)'
* Our last control unit
loc last_donor: disp "`b'"

*** Step 1: Initial Selection Loop
/* We begin by looping over our controls in regression. */
qui foreach i of var `donor_one'-`last_donor' {
    cap drop cfp
    constraint define 1 `i' = 1
    qui cnsreg `treated_unit' `i' if `time' < 45, constraint(1)
    // Calculating our rsquared statistic for the i-th model
    qui predict cfp if e(sample)
    qui corr `treated_unit' cfp if e(sample)
    frame rsquare: g `i' = r(rho)^2
    cap drop cfp
}

** Step 1b: now we select the unit with the highest r-squared statistic
frame rsquare {
    qui ds
    loc donors `r(varlist)'
    qui egen max_value = rowmax(*)
    qui gen max_var = ""
    * Loop through each column and update max_var
    qui foreach var of varlist `donors' {
        replace max_var = "`var'" if `var' == max_value
    }
    loc colmax : di max_var[1]
    loc U: di "`colmax'"
    di "First selected unit is `U'" // In this case it's canada
}
frame drop rsquare
cls
Here is the code for step 2, that is, selecting the second unit (New Zealand) given that canada is already in the control group:
Code:
// Step 2: DID Step
*******
*******
order `U', a(`treated_unit')
mkf rsquare
frame rsquare: set obs 1
loc newdonors : list donors - U
di "`newdonors'"
local nwords : word count `newdonors'
loc temp: word 1 of `newdonors' // first remaining donor
loc donor_one: disp "`temp'"
loc last_donor: word `nwords' of `newdonors'
di "The last donor in the set is `last_donor'"

/* In this step, we loop through the REMAINING donors,
   i.e. every donor that is not yet in U. */
foreach i of var `donor_one'-`last_donor' {
    // These must be created each time
    cap drop cfp
    cap drop ym
    * We take the average of controls, using the U selected group and the new donor `i'
    egen ym = rowmean(`U' `i')
    constraint define 1 ym = 1
    qui cnsreg `treated_unit' ym if `time' < 45, constraint(1)
    qui predict cfp if e(sample)
    qui corr `treated_unit' cfp if e(sample)
    qui frame rsquare: g `i' = r(rho)^2
    cap drop cfp

    // Compare the names as strings; without the quotes Stata compares
    // the first observations of the two variables instead
    if "`i'" == "`last_donor'" {
        frame rsquare {
            cap drop max_value
            cap drop max_var
            qui ds
            loc donors `r(varlist)'
            egen max_value = rowmax(*)
            gen max_var = ""
            * Loop through each column and update max_var
            qui foreach var of varlist `donors' {
                replace max_var = "`var'" if `var' == max_value
            }
            loc colmax : di max_var[1]
            di "Our next optimal donor is: `colmax'"
            local selectedunit `U' `colmax'
            loc newdonors : list donors - selectedunit
            di "Here are our new controls: `newdonors'"
        }
    }
}
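For what it's worth, here is a minimal sketch of the while-loop idea I have in mind. It is untested and written under stated assumptions, not a finished implementation: it assumes the -dataex- data are already in memory, that the pre-intervention period is time < 45 as above, and it tracks the best candidate in local macros instead of the rsquare frame. The local names pool, n, best, bestr2 and treated are placeholders I made up for illustration.

```stata
* Assumes the -dataex- data (hongkong plus the five donors) are in memory
* Creates time only if it does not exist yet
cap gen time = _n

local treated hongkong
* Donors that have not been selected yet
local pool australia newzealand austria canada denmark
* Selected donors, in order of selection (starts empty)
local U
local n : word count `pool'

while `n' > 0 {
    local best
    local bestr2 = -1
    foreach i of local pool {
        cap drop ym
        cap drop cfp
        * Average of the units already in U and the candidate `i'
        egen ym = rowmean(`U' `i')
        constraint define 1 ym = 1
        qui cnsreg `treated' ym if time < 45, constraint(1)
        qui predict cfp if e(sample)
        qui corr `treated' cfp if e(sample)
        * Keep the candidate with the largest r2 (guard against missing rho)
        if r(rho)^2 > `bestr2' & !missing(r(rho)) {
            local bestr2 = r(rho)^2
            local best `i'
        }
    }
    if "`best'" == "" {
        di as error "no valid candidate found; stopping"
        continue, break
    }
    * Move the winner from the pool to U and update the stopping condition
    local U `U' `best'
    local pool : list pool - best
    local n : word count `pool'
    di "Added `best' (r2 = " %6.4f `bestr2' "); remaining: `pool'"
}
cap drop ym
cap drop cfp
di "Selection order: `U'"
```

Because U starts empty, the first pass through the while loop averages each donor with nothing else, so it should reproduce step 1, and the loop stops once the word count of the pool reaches 0, which is the stopping condition described in my question.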