  • Using Synthetic Control Weights in a Difference in Differences model (SDID)

    Hi everyone,

    I am currently estimating a synthetic control (SC) model and would now like to use the resulting synthetic controls in a difference-in-differences model, i.e. a synthetic diff-in-diff (SDID). I cannot find code or guidance on how to run this in Stata, so I am asking for help here. Below I present one of my SC runs and the output I received so that you get a better understanding of my data.
    I am using panel data that I collapsed to the district level. I have three units (districts) and five time periods (1993, 1998, 2003, 2008, 2014). The pre-treatment periods are 1993-2003 and the post-treatment periods are 2008-2014. The treatment took place between 2003 and 2008, hence I am using 2008 as my treatment period. The treated unit is unit 1.

    The synth command I ran for my outcome variable age1birth (age at first birth):

    synth age1birth inschool highestyeared edsingleyears age age1marr pregnevermarr marrneverpreg evermarried everpregnant evertested age1birth(1993) age1birth(1998) age1birth(2003), trunit(1) trperiod(2008) fig
    Here is the Stata output for the command:

    . synth age1birth inschool highestyeared edsingleyears age age1marr pregnevermarr marrneverpreg everma
    > rried everpregnant evertested age1birth(1993) age1birth(1998) age1birth(2003), trunit(1) trperiod(20
    > 08) fig //BEST
    Synthetic Control Method for Comparative Case Studies
    First Step: Data Setup
    control units: for 2 of out 2 units missing obs for predictor evertested in period 1993 -ignored for a
    > veraging
    treated unit: for 1 of out 1 units missing obs for predictor evertested in period 1993 -ignored for av
    > eraging
    Data Setup successful
                    Treated Unit: bungoma
                   Control Units: busia, kakamega
              Dependent Variable: age1birth
      MSPE minimized for periods: 1993 1998 2003
    Results obtained for periods: 1993 1998 2003 2008 2014
                      Predictors: inschool highestyeared edsingleyears age age1marr pregnevermarr
                                  marrneverpreg evermarried everpregnant evertested age1birth(1993)
                                  age1birth(1998) age1birth(2003)
    Unless period is specified
    predictors are averaged over: 1993 1998 2003
    Second Step: Run Optimization
    Optimization done
    Third Step: Obtain Results
    Loss: Root Mean Squared Prediction Error
       RMSPE |  .4353367 
    Unit Weights:
        Co_No | Unit_Weight
        busia |         .34
     kakamega |         .66
    Predictor Balance:
                                   |   Treated  Synthetic 
                          inschool |  .3921929   .3514998 
                     highestyeared |  5.372596    5.27328 
                     edsingleyears |  7.351375   6.917235 
                               age |  19.25699   18.94241 
                          age1marr |  17.64313   17.53984 
                     pregnevermarr |  .0714234   .0922194 
                     marrneverpreg |  .0400861   .0424411 
                       evermarried |   .415141   .3941894 
                      everpregnant |  .4464783   .4439676 
                        evertested |  .0751843   .0947261 
                   age1birth(1993) |  18.05814   17.52789 
                   age1birth(1998) |  18.48276   18.02413 
                   age1birth(2003) |     17.98   17.70164 
    end of do-file
    Now let's say I would like to run this basic diff-in-diff regression:

    reg age1birth treatment post impact, r
    How do I include the SC weights in the diff-in-diff regression? The output tells me how to form the control group out of my donor pool (busia 0.34 and kakamega 0.66), and I now want to run a diff-in-diff, but with these weighted controls. I found some R code for running synthetic diff-in-diffs, but since I am not very familiar with R it did not help me much, although it might help you. Maybe there is a direct way to run an SDID in Stata?

    Many thanks for your help. I am happy to go into more detail if that is needed for an answer.


    g W = 1
    replace W = 0.34 if unit == "busia"  // I use "unit" because I don't know the name of the variable that holds "busia"
    replace W = 0.66 if unit=="kakamega"
    reg age1birth treatment post impact [aw=W], r
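
    A quick way to see what the weighted regression above is doing: with only treatment, post and their interaction on the right-hand side, the coefficient on impact should equal the difference in the four weighted cell means. The lines below are only a sketch of that check. They assume W has been created as above, that treatment and post are 0/1 indicators, and that impact is their product; the scalar names m00, m01, m10 and m11 are made up for illustration.

    ```stata
    * Assumes W, treatment, post and impact already exist (see the lines above)
    reg age1birth treatment post impact [aw=W], r
    display "DiD from regression: " _b[impact]

    * Recompute the same quantity from the four weighted cell means
    * m<t><p> = weighted mean of age1birth for treatment==t and post==p
    forvalues t = 0/1 {
        forvalues p = 0/1 {
            quietly summarize age1birth [aw=W] if treatment == `t' & post == `p', meanonly
            scalar m`t'`p' = r(mean)
        }
    }
    display "DiD by hand: " (m11 - m10) - (m01 - m00)
    ```

    If the two numbers differ, the most likely cause is that impact is not exactly treatment*post, or that some observations have a missing weight.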


      Thank you, George! That sounds plausible, so I will go ahead with that idea. I thought there might be a dedicated synthetic diff-in-diff command I could use, but since I am working with Stata 16.1, I suspect the newer commands are not available to me yet.

      Do you have any idea if I need to include the variables (predictors) I used to create the SCs in my regression as controls or if I can leave these out and only add controls that have not been used to create the SCs?

      Many thanks for your thoughts.



        I'm no expert, but I'd try it with and without. I've toyed with this approach before, but was uncomfortable matching on the outcome and then running a regression. It might be better to use CEM (coarsened exact matching) on the Xs.


          Hi Anja,

          I am also looking for a way to run an SDID in Stata. I was wondering whether you have gathered any further insights into how that could work?

          Thank you very much.



            Hey Benedikt Franz, here's some code given to me courtesy of Ariel Linden from a paper he wrote. Essentially, you take the unit weights that synth gives you and use them as weights in a parametric regression model. Note that this is NOT the same as the approach by Athey, Imbens and co-authors, which imposes weight restrictions on both units and time periods.
            * Cigsales data is available with the itsa package
            use cigsale.dta
            // or, if you don't have this dataset...
            // use this u "", clear
            * Synth weights from Abadie et al 2010
            gen wtpaper = 1 if state==3 // CA
            replace wtpaper = 0.164 if state==4 // CO
            replace wtpaper = 0.069 if state==5 // CT
            replace wtpaper = 0.199 if state==19 // MT
            replace wtpaper = 0.234 if state==21 // NV
            replace wtpaper = 0.334 if state==34 // UT
            label var wtpaper "synth wts from Abadie 2010"
            * Run synth with cigsale (produces different weights than those in the paper)
            synth cigsale beer lnincome retprice age15to24 cigsale(1988) cigsale(1980) cigsale(1975), ///
            trunit(3) trperiod(1989) nested xperiod(1970(1)1988) fig
            * Run ITSA for cigsale
            itsa cigsale [aw=wtpaper], treatid(3) trperiod(1989) lag(1) figure(xlabel(1970(5)2000) scheme(lean2) ylabel(, nogrid) xlabel(, nogrid) ///
            legend(off) title("") subtitle("") note("") xtitle(Year)) ///
            posttrend replace
            Note that itsa is written by Linden, but the same weighting idea applies to any regression model. I have not tested this code myself.
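
            For the Athey/Imbens-style estimator itself, I believe there is a community-contributed sdid command on SSC. The lines below are an untested sketch for the district data in #1, not a verified recipe. The variable names district_id (numeric, with 1 = bungoma as in trunit(1)), year and treated_post are assumptions, so please check help sdid for the exact requirements. As far as I know, the command expects a balanced panel and a 0/1 treatment indicator that switches on only for treated units in treated periods, and with a single treated unit only the placebo variance option is applicable.

            ```stata
            * Assumes the community-contributed sdid package from SSC
            ssc install sdid, replace

            * Hypothetical names: district_id (1 = bungoma), year
            * treated_post = 1 only for the treated district from 2008 onwards
            gen byte treated_post = (district_id == 1 & year >= 2008) if !missing(year)

            * Syntax: sdid depvar groupvar timevar treatment, vce(method)
            sdid age1birth district_id year treated_post, vce(placebo) reps(50) seed(1213)
            ```

            With only two control districts, the placebo inference rests on very few placebo units, so I would not read much into the standard error.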


              Hey Jared Greathouse, thank you so much for your helpful answer. I will have a look at it and try to implement.


                Dear George Ford, may I ask why you suggested analytic weights in #4? I am not criticizing or questioning the choice at all, I am simply curious.


