  • Writing a regression equation for a triple difference framework

    Hello, I am having trouble writing a regression equation for a series of comparisons of means. Here is some sample data:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(year y target policy post)
    2015  55 0 0 0
    2015  57 0 0 0
    2016  58 0 0 0
    2016  44 0 0 0
    2017  46 0 0 1
    2017  59 0 0 1
    2015 100 0 1 0
    2015 100 0 1 0
    2016 150 0 1 0
    2016 156 0 1 0
    2017 189 0 1 1
    2017 184 0 1 1
    2015  44 0 0 0
    2015  32 0 0 0
    2016  57 0 0 0
    2016  66 0 0 0
    2017  56 0 0 1
    2017  59 0 0 1
    2015 120 0 1 0
    2015 130 0 1 0
    2016 123 0 1 0
    2016 156 0 1 0
    2017 190 0 1 1
    2017 188 0 1 1
    2016 100 1 1 0
    2016 105 1 1 0
    2017 104 1 1 1
    2017 103 1 1 1
    2016  70 1 0 0
    2016  72 1 0 0
    2017  71 1 0 1
    2017  76 1 0 1
    end

    I have pre- and post-policy periods, policy and nonpolicy states, and targeted and nontargeted individuals. Note that the targeted individuals are only observed in the post-policy period.

    I first create a "placebo" diff-in-diff value using only the "nontargeted" units.

    Code:
    sum y if policy==1 & post==1 & target==0
    scalar t1post=r(mean)
    sum y if policy==1 & post==0 & target==0
    scalar t1pre=r(mean)
    
    scalar t1diff=t1post-t1pre
    
    sum y if policy==0 & post==1 & target==0
    scalar c1post=r(mean)
    sum y if policy==0 & post==0 & target==0
    scalar c1pre=r(mean)
    
    scalar c1diff=c1post-c1pre
    
    scalar placebodd=t1diff-c1diff
    
    
    di placebodd

    . di placebodd
    55


    This can easily be replicated in a regression framework, where the coefficient on the interaction term equals the placebo DD:

    Code:
    di placebodd
    
    reg y i.post##i.policy if target==0


    Next, I compare post-period means between policy and nonpolicy states among targeted individuals and subtract the "placebo" DD:
    Code:
    sum y if policy==1 & post==1 & target==1
    scalar t2post=r(mean)
    sum y if policy==0 & post==1 & target==1
    scalar c2post=r(mean)
    
    scalar t2diff=t2post-c2post
    
    scalar ddd=t2diff-placebodd
    
    di ddd

    . di ddd
    -25


    Now, my question is how do I write a regression equation that will yield a coefficient representing this scalar ddd (-25)?


  • #2
    Just run a regression with triple interactions and then use margins and lincom. Note that including the option -coeflegend- in the margins command will tell you how to refer to the coefficients in the margins output.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(year y target policy post)
    2015  55 0 0 0
    2015  57 0 0 0
    2016  58 0 0 0
    2016  44 0 0 0
    2017  46 0 0 1
    2017  59 0 0 1
    2015 100 0 1 0
    2015 100 0 1 0
    2016 150 0 1 0
    2016 156 0 1 0
    2017 189 0 1 1
    2017 184 0 1 1
    2015  44 0 0 0
    2015  32 0 0 0
    2016  57 0 0 0
    2016  66 0 0 0
    2017  56 0 0 1
    2017  59 0 0 1
    2015 120 0 1 0
    2015 130 0 1 0
    2016 123 0 1 0
    2016 156 0 1 0
    2017 190 0 1 1
    2017 188 0 1 1
    2016 100 1 1 0
    2016 105 1 1 0
    2017 104 1 1 1
    2017 103 1 1 1
    2016  70 1 0 0
    2016  72 1 0 0
    2017  71 1 0 1
    2017  76 1 0 1
    end
    
    regress y i.policy##i.post##i.target, robust
    margins  policy#post#target, post
    lincom (_b[1.policy#1.post#1.target]- _b[0bn.policy#1.post#1.target]) - ///
    ((_b[1.policy#1.post#0bn.target]-_b[1.policy#0bn.post#0bn.target])- ///
    (_b[0bn.policy#1.post#0bn.target]-_b[0bn.policy#0bn.post#0bn.target]))
    Results:

    Code:
    . regress y i.policy##i.post##i.target, robust
    
    Linear regression                               Number of obs     =         32
                                                    F(7, 24)          =     899.00
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.9351
                                                    Root MSE          =     13.961
    
    ------------------------------------------------------------------------------------
                       |               Robust
                     y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------------+----------------------------------------------------------------
              1.policy |      77.75   9.699388     8.02   0.000     57.73145    97.76855
                1.post |      3.375   5.151608     0.66   0.519    -7.257396     14.0074
                       |
           policy#post |
                  1 1  |         55   10.26193     5.36   0.000     33.82041    76.17959
                       |
              1.target |     19.375   4.207818     4.60   0.000     10.69049    28.05951
                       |
         policy#target |
                  1 1  |     -46.25   9.945424    -4.65   0.000    -66.77635   -25.72365
                       |
           post#target |
                  1 1  |      -.875   5.601107    -0.16   0.877    -12.43512    10.68512
                       |
    policy#post#target |
                1 1 1  |      -56.5   10.69925    -5.28   0.000    -78.58217   -34.41783
                       |
                 _cons |     51.625    4.12784    12.51   0.000     43.10556    60.14444
    ------------------------------------------------------------------------------------
    
    . margins  policy#post#target, post
    
    Adjusted predictions                                        Number of obs = 32
    Model VCE: Robust
    
    Expression: Linear prediction, predict()
    
    ------------------------------------------------------------------------------------
                       |            Delta-method
                       |     Margin   std. err.      t    P>|t|     [95% conf. interval]
    -------------------+----------------------------------------------------------------
    policy#post#target |
                0 0 0  |     51.625    4.12784    12.51   0.000     43.10556    60.14444
                0 0 1  |         71   .8164966    86.96   0.000     69.31483    72.68517
                0 1 0  |         55   3.082207    17.84   0.000     48.63864    61.36136
                0 1 1  |       73.5   2.041241    36.01   0.000     69.28708    77.71292
                1 0 0  |    129.375    8.77719    14.74   0.000     111.2598    147.4902
                1 0 1  |      102.5   2.041241    50.21   0.000     98.28708    106.7129
                1 1 0  |     187.75   1.314978   142.78   0.000      185.036     190.464
                1 1 1  |      103.5   .4082483   253.52   0.000     102.6574    104.3426
    ------------------------------------------------------------------------------------
    
    . lincom (_b[1.policy#1.post#1.target]- _b[0bn.policy#1.post#1.target]) - ///
    > ((_b[1.policy#1.post#0bn.target]-_b[1.policy#0bn.post#0bn.target])- ///
    > (_b[0bn.policy#1.post#0bn.target]-_b[0bn.policy#0bn.post#0bn.target]))
    
     ( 1)  - 0bn.policy#0bn.post#0bn.target + 0bn.policy#1.post#0bn.target - 0bn.policy#1.post#1.target +
           1.policy#0bn.post#0bn.target - 1.policy#1.post#0bn.target + 1.policy#1.post#1.target = 0
    
    ------------------------------------------------------------------------------
                 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             (1) |        -25   10.47094    -2.39   0.025    -46.61096   -3.389038
    ------------------------------------------------------------------------------
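
    To connect this to the regression equation asked about in #1, here is a sketch of the algebra behind the lincom. The notation is introduced here for illustration and does not come from the posts: P, S, and T stand for the policy, post, and target dummies, the betas are the coefficients of the fully interacted regression in the order Stata reports them, and \mu_{pst} is the mean of y in the cell with policy = p, post = s, target = t.

    ```latex
    \begin{align*}
    y &= \beta_0 + \beta_1 P + \beta_2 S + \beta_3 PS + \beta_4 T
         + \beta_5 PT + \beta_6 ST + \beta_7 PST + \varepsilon \\
    \mu_{111} - \mu_{011} &= \beta_1 + \beta_3 + \beta_5 + \beta_7 \\
    (\mu_{110} - \mu_{100}) - (\mu_{010} - \mu_{000}) &= \beta_3 \\
    \text{ddd} &= (\mu_{111} - \mu_{011})
         - \bigl[(\mu_{110} - \mu_{100}) - (\mu_{010} - \mu_{000})\bigr] \\
      &= \beta_1 + \beta_5 + \beta_7 = 77.75 - 46.25 - 56.5 = -25
    \end{align*}
    ```

    Read this way, the scalar from #1 is not the triple-interaction coefficient on its own (-56.5), because the targeted comparison uses post-period means only. The same combination should also be available as lincom 1.policy + 1.policy#1.target + 1.policy#1.post#1.target, run directly after -regress- and before -margins, post- replaces the stored estimates.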
    Last edited by Andrew Musau; 05 Apr 2024, 03:35.



    • #3
      Andrew Musau

      Ah thank you! What if I wanted to add geography and time fixed effects?


      Apologies, but I had an error in the sample data. Here is the new data:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(year y target policy post geo)
      2015  55 0 0 0 1
      2015  57 0 0 0 1
      2016  58 0 0 0 1
      2016  44 0 0 0 1
      2017  46 0 0 1 1
      2017  59 0 0 1 1
      2015 100 0 1 0 2
      2015 100 0 1 0 2
      2016 150 0 1 0 2
      2016 156 0 1 0 2
      2017 189 0 1 1 2
      2017 184 0 1 1 2
      2015  44 0 0 0 1
      2015  32 0 0 0 1
      2016  57 0 0 0 1
      2016  66 0 0 0 1
      2017  56 0 0 1 1
      2017  59 0 0 1 1
      2015 120 0 1 0 2
      2015 130 0 1 0 2
      2016 123 0 1 0 2
      2016 156 0 1 0 2
      2017 190 0 1 1 2
      2017 188 0 1 1 2
      2017 104 1 1 1 2
      2017 103 1 1 1 2
      2017  71 1 0 1 1
      2017  76 1 0 1 1
      end
      I have the following DD setup:

      Code:
      reg y i.post##i.policy if target==0
      Output:
      Code:
      . reg y i.post##i.policy if target==0
      
            Source |       SS           df       MS      Number of obs   =        24
      -------------+----------------------------------   F(3, 20)        =     92.48
             Model |  64509.4583         3  21503.1528   Prob > F        =    0.0000
          Residual |      4650.5        20     232.525   R-squared       =    0.9328
      -------------+----------------------------------   Adj R-squared   =    0.9227
             Total |  69159.9583        23  3006.95471   Root MSE        =    15.249
      
      ------------------------------------------------------------------------------
                 y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
            1.post |      3.375   9.337927     0.36   0.722    -16.10357    22.85357
          1.policy |      77.75   7.624385    10.20   0.000     61.84581    93.65419
                   |
       post#policy |
              1 1  |         55   13.20582     4.16   0.000     27.45314    82.54686
                   |
             _cons |     51.625   5.391254     9.58   0.000     40.37904    62.87096
      ------------------------------------------------------------------------------
      And the triple difference:

      Code:
      regress y i.policy##i.post##i.target
      margins  policy#post#target, post
      lincom (_b[1.policy#1.post#1.target]- _b[0bn.policy#1.post#1.target]) - ///
      ((_b[1.policy#1.post#0bn.target]-_b[1.policy#0bn.post#0bn.target])- ///
      (_b[0bn.policy#1.post#0bn.target]-_b[0bn.policy#0bn.post#0bn.target]))
      Output:
      Code:
      . regress y i.policy##i.post##i.target
      note: 0b.post#1.target identifies no observations in the sample.
      note: 1.post#1.target omitted because of collinearity.
      note: 0b.policy#0b.post#1.target identifies no observations in the sample.
      note: 1.policy#0b.post#1.target identifies no observations in the sample.
      note: 1.policy#1.post#1.target omitted because of collinearity.
      
            Source |       SS           df       MS      Number of obs   =        28
      -------------+----------------------------------   F(5, 22)        =     62.20
             Model |  65927.4643         5  13185.4929   Prob > F        =    0.0000
          Residual |      4663.5        22  211.977273   R-squared       =    0.9339
      -------------+----------------------------------   Adj R-squared   =    0.9189
             Total |  70590.9643        27  2614.48016   Root MSE        =    14.559
      
      ------------------------------------------------------------------------------------
                       y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------------+----------------------------------------------------------------
                1.policy |      77.75    7.27972    10.68   0.000     62.65279    92.84721
                  1.post |      3.375   8.915799     0.38   0.709    -15.11524    21.86524
                         |
             policy#post |
                    1 1  |         55   12.60884     4.36   0.000     28.85086    81.14914
                         |
                1.target |       18.5   12.60884     1.47   0.156    -7.649143    44.64914
                         |
           policy#target |
                    1 1  |    -102.75    17.8316    -5.76   0.000    -139.7305   -65.76953
                         |
             post#target |
                    0 1  |          0  (empty)
                    1 1  |          0  (omitted)
                         |
      policy#post#target |
                  0 0 1  |          0  (empty)
                  1 0 1  |          0  (empty)
                  1 1 1  |          0  (omitted)
                         |
                   _cons |     51.625   5.147539    10.03   0.000     40.94966    62.30034
      ------------------------------------------------------------------------------------
      
      . margins  policy#post#target, post
      
      Adjusted predictions                                        Number of obs = 28
      Model VCE: OLS
      
      Expression: Linear prediction, predict()
      
      ------------------------------------------------------------------------------------
                         |            Delta-method
                         |     Margin   std. err.      t    P>|t|     [95% conf. interval]
      -------------------+----------------------------------------------------------------
      policy#post#target |
                  0 0 0  |     51.625   5.147539    10.03   0.000     40.94966    62.30034
                  0 0 1  |          .  (not estimable)
                  0 1 0  |         55    7.27972     7.56   0.000     39.90279    70.09721
                  0 1 1  |       73.5   10.29508     7.14   0.000     52.14931    94.85069
                  1 0 0  |    129.375   5.147539    25.13   0.000     118.6997    140.0503
                  1 0 1  |          .  (not estimable)
                  1 1 0  |     187.75    7.27972    25.79   0.000     172.6528    202.8472
                  1 1 1  |      103.5   10.29508    10.05   0.000     82.14931    124.8507
      ------------------------------------------------------------------------------------
      
      . lincom (_b[1.policy#1.post#1.target]- _b[0bn.policy#1.post#1.target]) - ///
      > ((_b[1.policy#1.post#0bn.target]-_b[1.policy#0bn.post#0bn.target])- ///
      > (_b[0bn.policy#1.post#0bn.target]-_b[0bn.policy#0bn.post#0bn.target]))
      
       ( 1)  - 0bn.policy#0bn.post#0bn.target + 0bn.policy#1.post#0bn.target - 0bn.policy#1.post#1.target +
             1.policy#0bn.post#0bn.target - 1.policy#1.post#0bn.target + 1.policy#1.post#1.target = 0
      
      ------------------------------------------------------------------------------
                   | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
               (1) |        -25   19.26033    -1.30   0.208    -64.94348    14.94348
      ------------------------------------------------------------------------------


      Now let's say I wanted to add geographic and time fixed effects:
      Code:
      ssc install reghdfe, replace
      reghdfe y i.policy##i.post##i.target, absorb(year geo)
      Output:
      Code:
      . reghdfe y i.policy##i.post##i.target, absorb(year geo)
      (MWFE estimator converged in 2 iterations)
      note: 1bn.policy is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e
      > -09)
      note: 1bn.post is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-0
      > 9)
      note: 0b.post#0b.target omitted because of collinearity
      note: 0b.policy#0b.post#0b.target omitted because of collinearity
      
      HDFE Linear regression                            Number of obs   =         28
      Absorbing 2 HDFE groups                           F(   3,     21) =      26.51
                                                        Prob > F        =     0.0000
                                                        R-squared       =     0.9601
                                                        Adj R-squared   =     0.9487
                                                        Within R-sq.    =     0.7911
                                                        Root MSE        =    11.5769
      
      ------------------------------------------------------------------------------------
                       y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------------+----------------------------------------------------------------
                1.policy |          0  (omitted)
                  1.post |          0  (omitted)
                         |
             policy#post |
                    1 1  |         55   10.02586     5.49   0.000     34.15008    75.84992
                         |
                1.target |       18.5   10.02586     1.85   0.079    -2.349916    39.34992
                         |
           policy#target |
                    1 1  |    -102.75   14.17871    -7.25   0.000    -132.2362   -73.26377
                         |
             post#target |
                    0 1  |          0  (empty)
                    1 1  |          0  (omitted)
                         |
      policy#post#target |
                  0 0 1  |          0  (empty)
                  1 0 1  |          0  (empty)
                  1 1 1  |          0  (omitted)
                         |
                   _cons |   91.94643   3.229222    28.47   0.000     85.23089    98.66196
      ------------------------------------------------------------------------------------
      
      Absorbed degrees of freedom:
      -----------------------------------------------------+
       Absorbed FE | Categories  - Redundant  = Num. Coefs |
      -------------+---------------------------------------|
              year |         3           0           3     |
               geo |         2           1           1     |
      -----------------------------------------------------+
      How do I obtain the DDD coefficient in this case?
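
      One possible reading of the output above, offered as a sketch rather than a definitive answer: because the targeted pre-period cells are empty, the lincom from the plain regression reduces to the coefficient on 1.policy plus the coefficient on 1.policy#1.target (77.75 - 102.75 = -25). In the reghdfe run, 1.policy is absorbed by the geo fixed effect, so that sum cannot be formed from the reported coefficients. The code below assumes the -dataex- example from this post is in memory. It also relies on geo coinciding with policy in this sample (geo == policy + 1), which makes a separate geo dummy redundant; that will typically not hold in real data.

      ```stata
      * Assumes the -dataex- example above (with geo) is already in memory.
      * In this sample geo == policy + 1, so i.geo would be collinear with
      * i.policy and is left out; year effects enter as ordinary dummies.
      * Stata will drop one year dummy because post is a function of year.
      regress y i.policy##i.post##i.target i.year

      * DDD as a combination of regression coefficients. The post#target and
      * triple-interaction terms are omitted because targeted individuals
      * are never observed in the pre-period.
      lincom 1.policy + 1.policy#1.target
      ```

      In this small balanced sample one would expect the point estimate to stay at -25, since the reghdfe output reports the same policy#post and policy#target coefficients as the plain regression; the standard error will generally differ.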



      • #4
        What you are interested in in these applications is the treatment effect. Forget about the post and treat variables and simply define a single treatment indicator (=1 if individual \(i\) at time \(t\) is subject to the treatment, and 0 otherwise). Then use xtreg, fe or xtdidregress to estimate the treatment effect. See https://www.stata.com/stata17/differ...ences-DID-DDD/ for some discussion.
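
        A minimal sketch of the single-indicator setup described here, assuming the -dataex- example from #3 is in memory. The variable name treated is hypothetical. Defining treatment as targeted individuals in policy states during the post period is an assumption about the design, and so is the choice to control for i.target; neither is stated in the posts.

        ```stata
        * Assumes the -dataex- example from #3 (with geo) is already in memory.
        * treated is a hypothetical name; its definition is an assumption.
        generate byte treated = (policy == 1 & post == 1 & target == 1)

        * Geography fixed effects via -xtreg, fe-; year effects as dummies.
        xtset geo
        xtreg y i.treated i.target i.year, fe
        ```

        The coefficient on 1.treated answers a somewhat different question from the lincom in #2 and #3, because this specification has no separate policy#post term, so it need not equal -25.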
