Using Synthetic Control Weights in a Difference in Differences model (SDID)

Anja Jean-Mairet

Join Date: Apr 2021
Posts: 18

19 Jul 2021, 06:38

Hi everyone,

I am currently estimating a synthetic control (SC) model and would now like to use my synthetic controls in a difference-in-differences regression, i.e. a synthetic difference-in-differences (SDID) model. I cannot find code or help on how to run this in Stata, so I am asking here. Below I show one of my synth runs and the output I received so that you get a better understanding of my data.
I am using panel data which I collapsed to district level. I have three units (districts) and five time periods (1993, 1998, 2003, 2008, 2014). My pre-treatment periods are 1993-2003, and my post-treatment periods are 2008-2014. The treatment took place between 2003 and 2008, hence I am using 2008 as my treatment period. The treated unit is unit 1 (bungoma).

The synth command I ran for my outcome variable age1birth (age at first birth):

Code:

synth age1birth inschool highestyeared edsingleyears age age1marr pregnevermarr marrneverpreg evermarried everpregnant evertested age1birth(1993) age1birth(1998) age1birth(2003), trunit(1) trperiod(2008) fig

Here is the Stata output for the command:

HTML Code:

. synth age1birth inschool highestyeared edsingleyears age age1marr pregnevermarr marrneverpreg everma
> rried everpregnant evertested age1birth(1993) age1birth(1998) age1birth(2003), trunit(1) trperiod(20
> 08) fig //BEST
------------------------------------------------------------------------------------------------------
Synthetic Control Method for Comparative Case Studies
------------------------------------------------------------------------------------------------------

First Step: Data Setup
------------------------------------------------------------------------------------------------------
control units: for 2 of out 2 units missing obs for predictor evertested in period 1993 -ignored for a
> veraging
treated unit: for 1 of out 1 units missing obs for predictor evertested in period 1993 -ignored for av
> eraging
------------------------------------------------------------------------------------------------------
Data Setup successful
------------------------------------------------------------------------------------------------------
                Treated Unit: bungoma
               Control Units: busia, kakamega
------------------------------------------------------------------------------------------------------
          Dependent Variable: age1birth
  MSPE minimized for periods: 1993 1998 2003
Results obtained for periods: 1993 1998 2003 2008 2014
------------------------------------------------------------------------------------------------------
                  Predictors: inschool highestyeared edsingleyears age age1marr pregnevermarr
                              marrneverpreg evermarried everpregnant evertested age1birth(1993)
                              age1birth(1998) age1birth(2003)
------------------------------------------------------------------------------------------------------
Unless period is specified
predictors are averaged over: 1993 1998 2003
------------------------------------------------------------------------------------------------------

Second Step: Run Optimization
------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------
Optimization done
------------------------------------------------------------------------------------------------------

Third Step: Obtain Results
------------------------------------------------------------------------------------------------------
Loss: Root Mean Squared Prediction Error

---------------------
   RMSPE |  .4353367 
---------------------
------------------------------------------------------------------------------------------------------
Unit Weights:

-----------------------
    Co_No | Unit_Weight
----------+------------
    busia |         .34
 kakamega |         .66
-----------------------
------------------------------------------------------------------------------------------------------
Predictor Balance:

------------------------------------------------------
                               |   Treated  Synthetic 
-------------------------------+----------------------
                      inschool |  .3921929   .3514998 
                 highestyeared |  5.372596    5.27328 
                 edsingleyears |  7.351375   6.917235 
                           age |  19.25699   18.94241 
                      age1marr |  17.64313   17.53984 
                 pregnevermarr |  .0714234   .0922194 
                 marrneverpreg |  .0400861   .0424411 
                   evermarried |   .415141   .3941894 
                  everpregnant |  .4464783   .4439676 
                    evertested |  .0751843   .0947261 
               age1birth(1993) |  18.05814   17.52789 
               age1birth(1998) |  18.48276   18.02413 
               age1birth(2003) |     17.98   17.70164 
------------------------------------------------------
------------------------------------------------------------------------------------------------------

. 
end of do-file

Now let's say I would like to run this basic diff-in-diff regression:

Code:

reg age1birth treatment post impact, r

How do I include the SC weights in the diff-in-diff regression? The output tells me how to form the control group out of my donor pool (busia 0.34 and kakamega 0.66), and I now want to run a diff-in-diff with these weighted controls. I found some R code for running synthetic diff-in-diffs, but since I am not very familiar with R it did not help me much; it might help you, though (https://github.com/synth-inference/s.../tree/master/R). Maybe there is a direct way to run an SDID in Stata?

Many thanks for your help. I am happy to go into more detail if needed.

Best,
Anja

George Ford

Join Date: Aug 2014
Posts: 3152

21 Jul 2021, 16:25

perhaps?

Code:

gen W = 1                             // the treated unit keeps weight 1
replace W = 0.34 if unit == "busia"   // "unit" is a placeholder, since I don't know the name of the variable that holds "busia"
replace W = 0.66 if unit == "kakamega"
reg age1birth treatment post impact [aw=W], r

Anja Jean-Mairet

Join Date: Apr 2021

Posts: 18
#3

22 Jul 2021, 01:08

Thank you, George! That sounds plausible; I will go ahead with that idea. I thought there might be a specific synthetic control command I could use, but since I am working with Stata 16.1, I suspect the newer commands are not available to me yet.

Do you have any idea whether I need to include the variables (predictors) I used to create the synthetic controls as controls in my regression, or whether I can leave these out and only add controls that were not used to create them?

Many thanks for your thoughts.

Best,
Anja

George Ford

Join Date: Aug 2014

Posts: 3152
#4

22 Jul 2021, 07:04

I'm no expert, but I'd try it with and without. I've toyed with this approach before, but was uncomfortable matching on the outcome and then running a regression. It might be better to use CEM (coarsened exact matching) on the Xs.
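A minimal, untested sketch of the "with and without" comparison, assuming the weight variable W from the code in the earlier reply and the treatment, post and impact variables from the original regression. The three controls are an arbitrary subset of the synth predictor list, chosen only for illustration; with three districts and five periods there are few degrees of freedom, so adding every predictor at once is unlikely to be feasible.

```stata
* Weighted DiD without the synth predictors
* (W = synth unit weights for the donors, 1 for the treated district)
regress age1birth treatment post impact [aweight = W], vce(robust)
estimates store no_controls

* Same model with a few of the synth predictors added as controls
* (illustrative subset only; which controls to add is a judgment call)
regress age1birth treatment post impact inschool highestyeared age1marr [aweight = W], vce(robust)
estimates store with_controls

* Put the DiD coefficient (impact) from both specifications side by side
estimates table no_controls with_controls, keep(impact) b(%9.3f) se(%9.3f)
```

If the coefficient on impact moves a lot between the two columns, the result is sensitive to the choice of controls and should be interpreted with care.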
Benedikt Franz

Join Date: May 2021

Posts: 19
#5

21 Mar 2022, 10:21

Hi Anja,

I am also looking for a way to run an SDID in Stata. I was wondering whether you have gained any further insights into how that could work?

Thank you very much.

Best,
Benedikt

Jared Greathouse

Join Date: Sep 2021
Posts: 2170

21 Mar 2022, 12:38

Hey Benedikt Franz, here's some code given to me courtesy of Ariel Linden, from a paper he wrote. Essentially, you weight your sample by the unit weights that synth gives you and then use those weights in a parametric regression model. Note that this is NOT the same as the approach by Athey, Imbens and co-authors, which imposes weight restrictions on both units and time periods.

Code:

* Cigsale data is available with the itsa package
use cigsale.dta, clear
// or, if you don't have this dataset, use this instead:
// use "https://github.com/scunning1975/mixtape/blob/master/smoking.dta?raw=true", clear

* synth and itsa both need the panel structure declared
tsset state year

* Synth weights from Abadie et al 2010 (all other states stay missing)
gen wtpaper = 1 if state==3           // CA (treated unit)
replace wtpaper = 0.164 if state==4   // CO
replace wtpaper = 0.069 if state==5   // CT
replace wtpaper = 0.199 if state==19  // MT
replace wtpaper = 0.234 if state==21  // NV
replace wtpaper = 0.334 if state==34  // UT
label var wtpaper "synth wts from Abadie 2010"

* Run synth with cigsale (produces different weights than those in the paper)
synth cigsale beer lnincome retprice age15to24 cigsale(1988) cigsale(1980) cigsale(1975), ///
    trunit(3) trperiod(1989) nested xperiod(1970(1)1988) fig

* Run ITSA for cigsale, weighting by the synth weights
itsa cigsale [aw=wtpaper], treatid(3) trperiod(1989) lag(1) ///
    figure(xlabel(1970(5)2000) scheme(lean2) ylabel(, nogrid) xlabel(, nogrid) ///
    legend(off) title("") subtitle("") note("") xtitle(Year)) ///
    posttrend replace

Note that itsa was written by Linden, but the same weighting applies to any regression model. I have not tested this code.
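As a rough, equally untested sketch of the "any regression model" point, the same wtpaper weights could go into a plain two-group difference-in-differences regression. The variable names treated and post are made up here for illustration; state, year, cigsale and wtpaper are the ones from the code above.

```stata
* Assumes the data and the wtpaper variable from the code above are in memory
gen byte treated = (state == 3)     // California, the treated unit
gen byte post    = (year >= 1989)   // same treatment period as trperiod(1989)

* States with missing wtpaper (no synth weight) drop out of the estimation sample
regress cigsale i.treated##i.post [aweight = wtpaper], vce(robust)
```

The coefficient on 1.treated#1.post is the weighted difference-in-differences estimate.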

Benedikt Franz

Join Date: May 2021

Posts: 19
#7

25 Mar 2022, 13:17

Hey Jared Greathouse, thank you so much for your helpful answer. I will have a look at it and try to implement it.

Maxence Morlet

Join Date: Mar 2021

Posts: 653
#8

25 Mar 2022, 15:50

Dear George Ford, may I ask why you suggested analytic weights in #2? I am not criticizing or questioning, simply curious.

Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#9

25 Mar 2022, 17:33

This seems like the answer? Maxence Morlet