  • How to run DID after psmatch2

    Dear all
    I need to run a difference-in-differences analysis using propensity score matching. I have two groups of firms, treated and untreated. The event year is 1991, so I match treated and untreated firms on some criteria in 1990, the year before the event. After running psmatch2 with the mahalanobis option, I have pairs of treated and untreated firm IDs matched on the 1990 criteria.

    Now I need to run the difference-in-difference regression (Treated vs untreated firms and before 1991 vs after 1991). In other words, I need two dummy variables: treat (=0 for untreated and =1 for treated firm) and post (=0 for before 1991 and =1 for after 1991). However, I don't know how to link the results from psmatch2 into my D-I-D regression. Can you please help me with this?

    Thank you so much

  • #2
    I haven't used psmatch2 before, but does this answer your question? http://www.stata.com/statalist/archi.../msg00440.html



    • #3
      Dear Christos

      Thank you for your help. But it is not what I'm looking for. I already finished with the matching (which is only based on year 1990), but do not know how to use this in the whole sample.



      • #4
        You can take the difference in the outcome variable within each matched pair (treated minus control), separately for the periods before and after the treatment, and then regress that difference on the "post" dummy. This is exactly difference-in-differences: the first step takes the difference between the treated and control groups, which in your case is the difference within matched pairs, and the second step takes the difference across time periods.
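
        The algebra behind this can be written out as a short sketch. The notation is an assumption introduced here for illustration: y^T_{i,p} and y^C_{i,p} are the outcomes of the treated firm and its matched control in pair i and period p, and bars denote averages over pairs. It assumes every pair is observed in both periods.

        ```latex
        \begin{align*}
        d_{i,p} &= y^{T}_{i,p} - y^{C}_{i,p}, \qquad p \in \{0, 1\} \\
        \hat{\theta} &= \bar{d}_{1} - \bar{d}_{0}
          = \left(\bar{y}^{T}_{1} - \bar{y}^{C}_{1}\right)
          - \left(\bar{y}^{T}_{0} - \bar{y}^{C}_{0}\right)
        \end{align*}
        ```

        The slope from regressing d on a post dummy is the difference in the mean of d between the two periods, which is the usual difference-in-differences contrast on the right-hand side.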

        The following example with simulated data shows that this approach and a standard diff in diff regression yield the same estimate.

        Code:
        clear
        set seed 951
        * Generate some example data
        * Number of obs
        set obs 100
        * Control, period 0
        gen xc0=1+runiform()
        * Control, period 1
        gen xc1=1+runiform()
        * Treatment, period 0
        gen xt0=0.5+runiform()
        * Treatment, period 1
        gen xt1=1+runiform()
        * The treated mean rises by 0.5 while the control mean is unchanged,
        * so the true treatment effect is 0.5
        * End of data generation
        
        * Approach 1: Diff in diff regression
        preserve
        * Convert into long dataset
        gen id=_n
        reshape long x, i(id) j(type) string
        * Generate treat and post dummies
        gen t = inlist(type,"t1","t0")
        gen post = inlist(type,"t1","c1")
        * Generate interaction
        gen theta=t*post
        * DID regression
        reg x t post theta, cluster(id)
        est store approach1
        restore
        
        * Approach 2: Time regression on paired differences
        
        preserve
        * Generate differences across pairs
        gen d0=xt0-xc0
        gen d1=xt1-xc1
        * Convert to long dataset
        gen id=_n
        reshape long xt xc d, i(id) j(t)
        * Generate post dummy
        gen theta= (t==1)
        * Regression
        reg d theta, cluster(id)
        est store approach2
        restore
        
        * Approaches yield the same results
        est tab *, b se keep(theta)
        Jorge Eduardo Pérez Pérez
        www.jorgeperezperez.com
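
        To carry matched pairs over to a full firm-year panel, one option is to tag every firm with a pair identifier and a treat dummy, then build post from the year. The following is only a minimal sketch with simulated data. The names pair_id, firm, treat, post and y, the 1988-1993 window, and the choice to code 1991 itself as post are all assumptions for illustration, not output of psmatch2.

        ```stata
        clear
        set seed 12345
        * Hypothetical matched-pair list: one row per pair
        set obs 50
        gen pair_id = _n
        gen firm_t = _n          // hypothetical treated firm IDs 1-50
        gen firm_c = 100 + _n    // hypothetical matched control firm IDs 101-150
        * Stack the list so each row is one firm, tagged with its pair
        reshape long firm_, i(pair_id) j(role) string
        rename firm_ firm
        gen treat = (role == "t")
        * Expand to a firm-year panel covering 1988-1993
        expand 6
        bysort firm: gen year = 1987 + _n
        * Assumption: the event year 1991 counts as post
        gen post = (year >= 1991)
        * Simulated outcome with a true DID effect of 0.5
        gen y = 1 + 0.3*treat + 0.2*post + 0.5*treat*post + rnormal()
        * DID regression on the matched sample, clustering by pair
        regress y i.treat##i.post, vce(cluster pair_id)
        ```

        With real data, the firm-level list (firm, pair_id, treat) would be merged onto the panel by firm ID in place of the simulated expand step. If a control firm is matched to more than one treated firm, it belongs to several pairs and that merge needs more care.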



        • #5
          I am not an expert, but I have used the _weight variable as weights in the diff-in-diff regression. Something like the following, if you have the data in wide form.

          sysuse auto, clear

          *generate some variables
          gen rep80=rep78+runiform()*2 //post treatment variable
          gen treated=foreign //treatment variable

          *matching on pretreatment variable and possibly on other x-variables
          psmatch2 treated , outcome(rep80) mahalanobis(rep78 turn)

          *generate difference
          gen dif80_78=rep80-rep78

          *diff-in-diff regression on matched sample (uses _weight variable that psmatch2 generates)
          reg dif80_78 treated turn [aw=_weight]



          • #6
            Andres Vork This approach will not work here. psmatch2 will generate inverse probability weights, which may be used as weights in a regression after propensity score matching. For nearest neighbor matching, _weight will be equal to the number of controls per treated observation. You can see that in your own example by tabulating the _weight variable. In Mia Pham's design, with paired observations after matching, there would be one control observation per treated observation, and _weight will be 1 in all cases.
            Jorge Eduardo Pérez Pérez
            www.jorgeperezperez.com
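
            To check what _weight contains, the example from #5 can be rerun and the variable tabulated against treatment status. This sketch assumes psmatch2 is installed (ssc install psmatch2) and reuses the variable names from that post. The seed is an arbitrary choice, and no output is shown because the counts depend on the matches.

            ```stata
            sysuse auto, clear
            set seed 951
            * Same variables as in #5
            gen rep80 = rep78 + runiform()*2
            gen treated = foreign
            * Mahalanobis matching on the pretreatment variables, as in #5
            psmatch2 treated, outcome(rep80) mahalanobis(rep78 turn)
            * Distribution of the weights by treatment status, including missings
            tabulate _weight treated, missing
            ```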



            • #7
              Jorge Eduardo Perez Perez How will that work if the outcome variable is a dummy? How does the interpretation of the treatment effect change, given that the difference will take the values -1, 0 and 1?



              • #8
                You can think of the whole model as a linear probability model. The coefficients are then the effects of the independent variables on the probability that Y=1.
                Jorge Eduardo Pérez Pérez
                www.jorgeperezperez.com
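
                A minimal sketch of that reading, using simulated data. The variable names, the sample size and the probabilities (including a true effect of 0.2) are assumptions chosen for illustration.

                ```stata
                clear
                set seed 2718
                * 1,000 control and 1,000 treated units
                set obs 2000
                gen id = _n
                gen treat = (_n > 1000)
                * Two periods per unit
                expand 2
                bysort id: gen post = (_n == 2)
                * Pr(y=1): 0.3 baseline, +0.1 treated, +0.05 post, +0.2 DID effect
                gen p = 0.3 + 0.1*treat + 0.05*post + 0.2*treat*post
                gen y = (runiform() < p)
                * Linear probability model; the interaction is the DID effect
                regress y i.treat##i.post, vce(cluster id)
                ```

                The coefficient on the treat-by-post interaction is in probability units, so a value near 0.2 would mean treatment raises the probability that y=1 by about 20 percentage points.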
