How to run DID after psmatch2

Mia Pham

Join Date: Mar 2015

Posts: 44
#1


23 Aug 2015, 22:15

Dear all
I need to run a difference-in-differences analysis using propensity score matching. I have two groups of firms, treated and untreated. The event year is 1991, so I match treated and untreated firms on some criteria measured in 1990, the year before the event. After running psmatch2 with the mahalanobis() option, I have pairs of treated and untreated firm IDs matched on the 1990 criteria.

Now I need to run the difference-in-differences regression (treated vs. untreated firms, and before vs. after 1991). In other words, I need two dummy variables: treat (=0 for untreated and =1 for treated firms) and post (=0 before 1991 and =1 after 1991). However, I don't know how to link the results from psmatch2 to my DID regression. Can you please help me with this?

Thank you so much
Christos Makridis

Join Date: Nov 2014

Posts: 157
#2

24 Aug 2015, 00:18

I haven't used psmatch2 before, but does this answer your question? http://www.stata.com/statalist/archi.../msg00440.html
Mia Pham

Join Date: Mar 2015

Posts: 44
#3

25 Aug 2015, 01:16

Dear Christos

Thank you for your help, but it is not what I'm looking for. I have already finished the matching (which is based only on the year 1990), but I do not know how to use it in the whole sample.

Jorge Eduardo Perez Perez

Join Date: Mar 2014
Posts: 429

25 Aug 2015, 05:00

You can take the difference in the outcome variable within every matched pair (treated minus control), separately for the periods before and after the treatment. Then regress that difference on the "post" dummy. This is exactly differences in differences. The first step takes the difference between the treated and control group, but in your case you are taking that difference between matched pairs. The second step takes the difference across time periods.
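To spell out why the two steps reproduce the usual DID estimate, here is a short derivation. The notation is my own assumption, not from the thread: pairs are indexed by i, periods by t (0 = before, 1 = after), superscripts T and C mark the treated and control firm in a pair, and bars denote averages over pairs. It assumes every pair is observed in both periods.

```latex
% Assumed notation: pair i = 1,...,N; period t in {0,1}
\begin{align*}
d_{it} &= y^{T}_{it} - y^{C}_{it}, \\
d_{it} &= \alpha + \theta\,\mathrm{post}_{t} + \varepsilon_{it}, \\
\hat{\theta} &= \bar{d}_{1} - \bar{d}_{0}
  = \left(\bar{y}^{T}_{1} - \bar{y}^{C}_{1}\right) - \left(\bar{y}^{T}_{0} - \bar{y}^{C}_{0}\right)
  = \left(\bar{y}^{T}_{1} - \bar{y}^{T}_{0}\right) - \left(\bar{y}^{C}_{1} - \bar{y}^{C}_{0}\right).
\end{align*}
```

The last expression is the change for the treated minus the change for the controls, which is also what the interaction coefficient in a regression on treat, post and treat*post estimates.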

The following example with simulated data shows that this approach and a standard diff in diff regression yield the same estimate.

Code:

clear
set seed 951
* Generate some example data
* Number of obs
set obs 100
* Control, period 0
gen xc0=1+uniform()
* Control, period 1
gen xc1=1+uniform()
* Treatment, period 0
gen xt0=0.5+uniform()
* Treatment, period 1
gen xt1=1+uniform()
* From this, treatment effect is 0.5
* End of data generation

* Approach 1: Diff in diff regression
preserve
* Convert into long dataset
gen id=_n
reshape long x, i(id) j(type) string
* Generate treat and post dummies
gen t = inlist(type,"t1","t0")
gen post = inlist(type,"t1","c1")
* Generate interaction
gen theta=t*post
* DID regression
reg x t post theta, cluster(id)
est store approach1
restore

* Approach 2: Time regression on paired differences

preserve
* Generate differences across pairs
gen d0=xt0-xc0
gen d1=xt1-xc1
* Convert to long dataset
gen id=_n
reshape long xt xc d, i(id) j(t)
* Generate post dummy
gen theta= (t==1)
* Regression
reg d theta, cluster(id)
est store approach2
restore

* Approaches yield the same results
est tab *, b se keep(theta)
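To carry the 1990 matches over to the full panel, one option is to save a firm-level file holding the treatment dummy and a pair identifier, and merge it onto every year of the panel. The following is only a sketch on simulated data. The names firm_id, pair_id, treat, post and y are hypothetical, the pairing is made up rather than taken from psmatch2 output, and 1991 is assumed to count as a post-treatment year.

```stata
clear
set seed 951

* Hypothetical pair file, standing in for the result of the 1990 matching:
* firms 1-20 are treated and firm k is paired with firm k+20
set obs 40
gen firm_id = _n
gen treat = firm_id <= 20
gen pair_id = cond(treat, firm_id, firm_id - 20)
tempfile pairs
save `pairs'

* Hypothetical panel: 50 firms observed 1988-1993 (firms 41-50 are unmatched)
clear
set obs 50
gen firm_id = _n
expand 6
bysort firm_id: gen year = 1987 + _n
gen y = rnormal()

* Attach treat and pair_id to every firm-year; unmatched firms are dropped
merge m:1 firm_id using `pairs', keep(match) nogenerate

* Post dummy (assumption: 1991 itself is a post year)
gen post = year >= 1991

* Build in a treatment effect of 0.5 for the simulation
replace y = y + 0.5 if treat == 1 & post == 1

* DID regression on the matched sample, clustering by matched pair
regress y i.treat##i.post, vce(cluster pair_id)
```

Because treat and pair_id are fixed at the firm level, the 1990 match only decides which firms stay in the sample and how they are paired. The coefficient on the treat-by-post interaction is the DID estimate.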

Jorge Eduardo Pérez Pérez
www.jorgeperezperez.com


Andres Vork

Join Date: Aug 2015

Posts: 1
#5

25 Aug 2015, 05:38

I am not an expert, but I have used the _weight variable that psmatch2 creates as a weight in the diff-in-diff regression. Something like the following, if your data are in wide form:

sysuse auto, clear

*generate some variables
gen rep80=rep78+runiform()*2 //post treatment variable
gen treated=foreign //treatment variable

*matching on pretreatment variable and possibly on other x-variables
psmatch2 treated , outcome(rep80) mahalanobis(rep78 turn)

*generate difference
gen dif80_78=rep80-rep78

*dif-dif regression on matched sample (uses _weight variable that psmatch2 generates)
reg dif80_78 treated turn [aw=_weight]
Jorge Eduardo Perez Perez

Join Date: Mar 2014

Posts: 429
#6

25 Aug 2015, 08:58

Andres Vork, this approach will not work here. psmatch2 will generate inverse probability weights, which may be used as weights in regression, after propensity score matching. For nearest neighbor matching, _weight will be equal to the number of controls per treated observation. You can see that in your own example by tabulating the _weight variable. In Mia Pham's design, with paired observations after matching, there would be one control observation per treated observation, and _weight will be 1 in all cases.

Diana Contreras

Join Date: Jan 2017

Posts: 1
#7

23 Jan 2017, 18:18

Jorge Eduardo Perez Perez, how would that work if the outcome variable is a dummy? How does the interpretation of the treatment effect change, given that the difference will take the values -1, 0 and 1?
Jorge Eduardo Perez Perez

Join Date: Mar 2014

Posts: 429
#8

25 Jan 2017, 12:36

You can think of the whole model as a linear probability model. The coefficients would then be the effects of the independent variables on the probability that Y=1.
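For the paired-difference regression, this interpretation can be written out as follows. The notation is assumed for illustration: Y is the binary outcome, T and C mark the treated and control firm in pair i, and t is 0 before and 1 after the treatment.

```latex
% Assumed notation: d_{it} = Y^{T}_{it} - Y^{C}_{it} takes values in {-1, 0, 1}
\begin{align*}
E[d_{it}] &= E[Y^{T}_{it}] - E[Y^{C}_{it}]
           = \Pr(Y^{T}_{it}=1) - \Pr(Y^{C}_{it}=1), \\
\theta &= E[d_{i1}] - E[d_{i0}] \\
       &= \left[\Pr(Y^{T}_{i1}=1) - \Pr(Y^{C}_{i1}=1)\right]
        - \left[\Pr(Y^{T}_{i0}=1) - \Pr(Y^{C}_{i0}=1)\right].
\end{align*}
```

So although each individual difference is -1, 0 or 1, its average is a gap in probabilities, and the coefficient on the post dummy is the change in that gap. An estimate of 0.10, for example, would read as a 10 percentage point increase in the probability that Y=1 for treated firms relative to their matched controls.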

