Writing a regression equation for a triple difference framework

Will Fannin

Join Date: Apr 2021
Posts: 10

04 Apr 2024, 14:54

Hello, I am having trouble writing a regression equation for a series of comparisons of means. Here is some sample data:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(year y target policy post)
2015  55 0 0 0
2015  57 0 0 0
2016  58 0 0 0
2016  44 0 0 0
2017  46 0 0 1
2017  59 0 0 1
2015 100 0 1 0
2015 100 0 1 0
2016 150 0 1 0
2016 156 0 1 0
2017 189 0 1 1
2017 184 0 1 1
2015  44 0 0 0
2015  32 0 0 0
2016  57 0 0 0
2016  66 0 0 0
2017  56 0 0 1
2017  59 0 0 1
2015 120 0 1 0
2015 130 0 1 0
2016 123 0 1 0
2016 156 0 1 0
2017 190 0 1 1
2017 188 0 1 1
2016 100 1 1 0
2016 105 1 1 0
2017 104 1 1 1
2017 103 1 1 1
2016  70 1 0 0
2016  72 1 0 0
2017  71 1 0 1
2017  76 1 0 1
end

I have pre- and post-policy periods, policy and non-policy states, and targeted and non-targeted individuals. Note that the targeted individuals are only observed in the post-policy period.

I first create a "placebo" diff-in-diff value using only the "nontargeted" units.

Code:

sum y if policy==1 & post==1 & target==0
scalar t1post=r(mean)
sum y if policy==1 & post==0 & target==0
scalar t1pre=r(mean)

scalar t1diff=t1post-t1pre

sum y if policy==0 & post==1 & target==0
scalar c1post=r(mean)
sum y if policy==0 & post==0 & target==0
scalar c1pre=r(mean)

scalar c1diff=c1post-c1pre

scalar placebodd=t1diff-c1diff


di placebodd

. di placebodd
55

This can easily be replicated in a regression framework, where the coefficient on the interaction term equals placebodd:

Code:

di placebodd

reg y i.post##i.policy if target==0
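
To see why the interaction coefficient reproduces placebodd, it may help to write the regression out. The notation below (the \(\beta\)'s, and \(\bar y_{p,t}\) for the mean of y with policy = p and post = t) is introduced here for illustration and is not from the original post; the cell means are those of the target==0 observations in the sample data.

```latex
\begin{aligned}
y_i &= \beta_0 + \beta_1\,\mathrm{post}_i + \beta_2\,\mathrm{policy}_i
      + \beta_3\,(\mathrm{post}_i \times \mathrm{policy}_i) + \varepsilon_i \\[4pt]
\hat\beta_3 &= (\bar y_{1,1} - \bar y_{1,0}) - (\bar y_{0,1} - \bar y_{0,0}) \\
            &= (187.75 - 129.375) - (55 - 51.625) \\
            &= 58.375 - 3.375 = 55
\end{aligned}
```

Because the model has one parameter per policy-by-post cell, the fitted values are the four cell means, so the interaction coefficient is exactly the difference of the two pre/post differences.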

Next, I take the post-period difference between policy and non-policy states among targeted individuals and subtract the "placebo" DD:

Code:

sum y if policy==1 & post==1 & target==1
scalar t2post=r(mean)
sum y if policy==0 & post==1 & target==1
scalar c2post=r(mean)

scalar t2diff=t2post-c2post

scalar ddd=t2diff-placebodd

di ddd

. di ddd
-25
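
Every quantity above is a combination of cell means, so it can help to list all of them at once. The following is a minimal sketch, assuming the sample data above are in memory; the variable name ybar is introduced here for illustration only.

```stata
* List the mean of y in each target x policy x post cell.
* -preserve- / -restore- leave the original data untouched.
preserve
collapse (mean) ybar = y, by(target policy post)
list target policy post ybar, sepby(target) noobs
restore
```

For target==1, the post-period means are 103.5 (policy states) and 73.5 (non-policy states), so t2diff is 30 and ddd is 30 - 55 = -25.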

Now, my question is how do I write a regression equation that will yield a coefficient representing this scalar ddd (-25)?

Andrew Musau

Join Date: Oct 2014
Posts: 10089

05 Apr 2024, 02:22

Just run a regression with triple interactions and then use margins and lincom. Note that including the option -coeflegend- in the margins command will tell you how to refer to the coefficients in the margins output.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(year y target policy post)
2015  55 0 0 0
2015  57 0 0 0
2016  58 0 0 0
2016  44 0 0 0
2017  46 0 0 1
2017  59 0 0 1
2015 100 0 1 0
2015 100 0 1 0
2016 150 0 1 0
2016 156 0 1 0
2017 189 0 1 1
2017 184 0 1 1
2015  44 0 0 0
2015  32 0 0 0
2016  57 0 0 0
2016  66 0 0 0
2017  56 0 0 1
2017  59 0 0 1
2015 120 0 1 0
2015 130 0 1 0
2016 123 0 1 0
2016 156 0 1 0
2017 190 0 1 1
2017 188 0 1 1
2016 100 1 1 0
2016 105 1 1 0
2017 104 1 1 1
2017 103 1 1 1
2016  70 1 0 0
2016  72 1 0 0
2017  71 1 0 1
2017  76 1 0 1
end

regress y i.policy##i.post##i.target, robust
margins  policy#post#target, post
lincom (_b[1.policy#1.post#1.target]- _b[0bn.policy#1.post#1.target]) - ///
((_b[1.policy#1.post#0bn.target]-_b[1.policy#0bn.post#0bn.target])- ///
(_b[0bn.policy#1.post#0bn.target]-_b[0bn.policy#0bn.post#0bn.target]))

Res.:

Code:

. regress y i.policy##i.post##i.target, robust

Linear regression                               Number of obs     =         32
                                                F(7, 24)          =     899.00
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9351
                                                Root MSE          =     13.961

------------------------------------------------------------------------------------
                   |               Robust
                 y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
          1.policy |      77.75   9.699388     8.02   0.000     57.73145    97.76855
            1.post |      3.375   5.151608     0.66   0.519    -7.257396     14.0074
                   |
       policy#post |
              1 1  |         55   10.26193     5.36   0.000     33.82041    76.17959
                   |
          1.target |     19.375   4.207818     4.60   0.000     10.69049    28.05951
                   |
     policy#target |
              1 1  |     -46.25   9.945424    -4.65   0.000    -66.77635   -25.72365
                   |
       post#target |
              1 1  |      -.875   5.601107    -0.16   0.877    -12.43512    10.68512
                   |
policy#post#target |
            1 1 1  |      -56.5   10.69925    -5.28   0.000    -78.58217   -34.41783
                   |
             _cons |     51.625    4.12784    12.51   0.000     43.10556    60.14444
------------------------------------------------------------------------------------

. margins  policy#post#target, post

Adjusted predictions                                        Number of obs = 32
Model VCE: Robust

Expression: Linear prediction, predict()

------------------------------------------------------------------------------------
                   |            Delta-method
                   |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
policy#post#target |
            0 0 0  |     51.625    4.12784    12.51   0.000     43.10556    60.14444
            0 0 1  |         71   .8164966    86.96   0.000     69.31483    72.68517
            0 1 0  |         55   3.082207    17.84   0.000     48.63864    61.36136
            0 1 1  |       73.5   2.041241    36.01   0.000     69.28708    77.71292
            1 0 0  |    129.375    8.77719    14.74   0.000     111.2598    147.4902
            1 0 1  |      102.5   2.041241    50.21   0.000     98.28708    106.7129
            1 1 0  |     187.75   1.314978   142.78   0.000      185.036     190.464
            1 1 1  |      103.5   .4082483   253.52   0.000     102.6574    104.3426
------------------------------------------------------------------------------------

. lincom (_b[1.policy#1.post#1.target]- _b[0bn.policy#1.post#1.target]) - ///
> ((_b[1.policy#1.post#0bn.target]-_b[1.policy#0bn.post#0bn.target])- ///
> (_b[0bn.policy#1.post#0bn.target]-_b[0bn.policy#0bn.post#0bn.target]))

 ( 1)  - 0bn.policy#0bn.post#0bn.target + 0bn.policy#1.post#0bn.target - 0bn.policy#1.post#1.target +
       1.policy#0bn.post#0bn.target - 1.policy#1.post#0bn.target + 1.policy#1.post#1.target = 0

------------------------------------------------------------------------------
             | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |        -25   10.47094    -2.39   0.025    -46.61096   -3.389038
------------------------------------------------------------------------------
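
Because each margin is just a sum of regression coefficients, the same contrast can also be formed from the coefficients of the triple-interaction regression directly, without the intermediate margins call. This is a sketch rather than part of the reply above; it assumes the same data are in memory.

```stata
* Refit the saturated model (no -margins, post-, so e(b) holds the
* regression coefficients themselves).
regress y i.policy##i.post##i.target, robust

* In cell means: ddd = (m111 - m011) - [(m110 - m100) - (m010 - m000)].
* Writing each cell as a sum of coefficients, everything cancels except
*   1.policy + 1.policy#1.target + 1.policy#1.post#1.target
*   = 77.75 + (-46.25) + (-56.5) = -25
lincom 1.policy + 1.policy#1.target + 1.policy#1.post#1.target
```

Note that the triple-interaction coefficient on its own (-56.5) is a different quantity: it also uses the pre-period means of the targeted group, which the hand-computed ddd does not.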

Last edited by Andrew Musau; 05 Apr 2024, 02:35.

Will Fannin

Join Date: Apr 2021
Posts: 10

05 Apr 2024, 11:01

Andrew Musau

Ah thank you! What if I wanted to add geography and time fixed effects?

Apologies, but I had an error in the sample data. Here is the corrected data:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(year y target policy post geo)
2015  55 0 0 0 1
2015  57 0 0 0 1
2016  58 0 0 0 1
2016  44 0 0 0 1
2017  46 0 0 1 1
2017  59 0 0 1 1
2015 100 0 1 0 2
2015 100 0 1 0 2
2016 150 0 1 0 2
2016 156 0 1 0 2
2017 189 0 1 1 2
2017 184 0 1 1 2
2015  44 0 0 0 1
2015  32 0 0 0 1
2016  57 0 0 0 1
2016  66 0 0 0 1
2017  56 0 0 1 1
2017  59 0 0 1 1
2015 120 0 1 0 2
2015 130 0 1 0 2
2016 123 0 1 0 2
2016 156 0 1 0 2
2017 190 0 1 1 2
2017 188 0 1 1 2
2017 104 1 1 1 2
2017 103 1 1 1 2
2017  71 1 0 1 1
2017  76 1 0 1 1
end

I have the following DD setup:

Code:

reg y i.post##i.policy if target==0

Output:

Code:

. reg y i.post##i.policy if target==0

      Source |       SS           df       MS      Number of obs   =        24
-------------+----------------------------------   F(3, 20)        =     92.48
       Model |  64509.4583         3  21503.1528   Prob > F        =    0.0000
    Residual |      4650.5        20     232.525   R-squared       =    0.9328
-------------+----------------------------------   Adj R-squared   =    0.9227
       Total |  69159.9583        23  3006.95471   Root MSE        =    15.249

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      1.post |      3.375   9.337927     0.36   0.722    -16.10357    22.85357
    1.policy |      77.75   7.624385    10.20   0.000     61.84581    93.65419
             |
 post#policy |
        1 1  |         55   13.20582     4.16   0.000     27.45314    82.54686
             |
       _cons |     51.625   5.391254     9.58   0.000     40.37904    62.87096
------------------------------------------------------------------------------

And the triple difference:

Code:

regress y i.policy##i.post##i.target
margins  policy#post#target, post
lincom (_b[1.policy#1.post#1.target]- _b[0bn.policy#1.post#1.target]) - ///
((_b[1.policy#1.post#0bn.target]-_b[1.policy#0bn.post#0bn.target])- ///
(_b[0bn.policy#1.post#0bn.target]-_b[0bn.policy#0bn.post#0bn.target]))

Output:

Code:

. regress y i.policy##i.post##i.target
note: 0b.post#1.target identifies no observations in the sample.
note: 1.post#1.target omitted because of collinearity.
note: 0b.policy#0b.post#1.target identifies no observations in the sample.
note: 1.policy#0b.post#1.target identifies no observations in the sample.
note: 1.policy#1.post#1.target omitted because of collinearity.

      Source |       SS           df       MS      Number of obs   =        28
-------------+----------------------------------   F(5, 22)        =     62.20
       Model |  65927.4643         5  13185.4929   Prob > F        =    0.0000
    Residual |      4663.5        22  211.977273   R-squared       =    0.9339
-------------+----------------------------------   Adj R-squared   =    0.9189
       Total |  70590.9643        27  2614.48016   Root MSE        =    14.559

------------------------------------------------------------------------------------
                 y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
          1.policy |      77.75    7.27972    10.68   0.000     62.65279    92.84721
            1.post |      3.375   8.915799     0.38   0.709    -15.11524    21.86524
                   |
       policy#post |
              1 1  |         55   12.60884     4.36   0.000     28.85086    81.14914
                   |
          1.target |       18.5   12.60884     1.47   0.156    -7.649143    44.64914
                   |
     policy#target |
              1 1  |    -102.75    17.8316    -5.76   0.000    -139.7305   -65.76953
                   |
       post#target |
              0 1  |          0  (empty)
              1 1  |          0  (omitted)
                   |
policy#post#target |
            0 0 1  |          0  (empty)
            1 0 1  |          0  (empty)
            1 1 1  |          0  (omitted)
                   |
             _cons |     51.625   5.147539    10.03   0.000     40.94966    62.30034
------------------------------------------------------------------------------------

. margins  policy#post#target, post

Adjusted predictions                                        Number of obs = 28
Model VCE: OLS

Expression: Linear prediction, predict()

------------------------------------------------------------------------------------
                   |            Delta-method
                   |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
policy#post#target |
            0 0 0  |     51.625   5.147539    10.03   0.000     40.94966    62.30034
            0 0 1  |          .  (not estimable)
            0 1 0  |         55    7.27972     7.56   0.000     39.90279    70.09721
            0 1 1  |       73.5   10.29508     7.14   0.000     52.14931    94.85069
            1 0 0  |    129.375   5.147539    25.13   0.000     118.6997    140.0503
            1 0 1  |          .  (not estimable)
            1 1 0  |     187.75    7.27972    25.79   0.000     172.6528    202.8472
            1 1 1  |      103.5   10.29508    10.05   0.000     82.14931    124.8507
------------------------------------------------------------------------------------

. lincom (_b[1.policy#1.post#1.target]- _b[0bn.policy#1.post#1.target]) - ///
> ((_b[1.policy#1.post#0bn.target]-_b[1.policy#0bn.post#0bn.target])- ///
> (_b[0bn.policy#1.post#0bn.target]-_b[0bn.policy#0bn.post#0bn.target]))

 ( 1)  - 0bn.policy#0bn.post#0bn.target + 0bn.policy#1.post#0bn.target - 0bn.policy#1.post#1.target +
       1.policy#0bn.post#0bn.target - 1.policy#1.post#0bn.target + 1.policy#1.post#1.target = 0

------------------------------------------------------------------------------
             | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |        -25   19.26033    -1.30   0.208    -64.94348    14.94348
------------------------------------------------------------------------------
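
In the corrected data the targeted group is never observed before the policy, which is why the post#target and policy#post#target terms are reported as empty or omitted. The margins-based contrast still works, and in terms of the regression coefficients it reduces to two terms. A minimal sketch, assuming the corrected data are in memory:

```stata
* Same model as above; the empty and omitted terms are fixed at zero.
regress y i.policy##i.post##i.target

* (m111 - m011) - [(m110 - m100) - (m010 - m000)] collapses to
*   1.policy + 1.policy#1.target = 77.75 + (-102.75) = -25
lincom 1.policy + 1.policy#1.target
```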

Now let's say I wanted to add geographic and time fixed effects:

Code:

ssc install reghdfe, replace
reghdfe y i.policy##i.post##i.target, absorb(year geo)

Output:

Code:

. reghdfe y i.policy##i.post##i.target, absorb(year geo)
(MWFE estimator converged in 2 iterations)
note: 1bn.policy is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e
> -09)
note: 1bn.post is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-0
> 9)
note: 0b.post#0b.target omitted because of collinearity
note: 0b.policy#0b.post#0b.target omitted because of collinearity

HDFE Linear regression                            Number of obs   =         28
Absorbing 2 HDFE groups                           F(   3,     21) =      26.51
                                                  Prob > F        =     0.0000
                                                  R-squared       =     0.9601
                                                  Adj R-squared   =     0.9487
                                                  Within R-sq.    =     0.7911
                                                  Root MSE        =    11.5769

------------------------------------------------------------------------------------
                 y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
          1.policy |          0  (omitted)
            1.post |          0  (omitted)
                   |
       policy#post |
              1 1  |         55   10.02586     5.49   0.000     34.15008    75.84992
                   |
          1.target |       18.5   10.02586     1.85   0.079    -2.349916    39.34992
                   |
     policy#target |
              1 1  |    -102.75   14.17871    -7.25   0.000    -132.2362   -73.26377
                   |
       post#target |
              0 1  |          0  (empty)
              1 1  |          0  (omitted)
                   |
policy#post#target |
            0 0 1  |          0  (empty)
            1 0 1  |          0  (empty)
            1 1 1  |          0  (omitted)
                   |
             _cons |   91.94643   3.229222    28.47   0.000     85.23089    98.66196
------------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
        year |         3           0           3     |
         geo |         2           1           1     |
-----------------------------------------------------+

How do I obtain the DDD coefficient in this case?
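
One possible way forward, offered as a sketch under assumptions rather than a definitive answer: in the posted data geo equals 2 exactly when policy equals 1, so the geography effect and the policy main effect are the same thing, which is why reghdfe reports 1.policy as collinear. Once that term is absorbed, the combination of coefficients that makes up ddd can no longer be read off the output. Keeping i.policy in the model as the geography effect and adding year dummies explicitly with regress avoids that.

```stata
* Assumes the corrected data are in memory. i.policy doubles as the
* geography effect because geo and policy coincide in this sample.
* One of 1.post / 2017.year is dropped as collinear; that is expected.
regress y i.policy##i.post##i.target i.year

* The DDD contrast uses only terms that remain estimable.
lincom 1.policy + 1.policy#1.target
```

On the posted data this should again give -25. With real data, where geography is finer than the policy indicator, the contrast would have to be rebuilt from whichever terms remain estimable.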

Andrew Musau

Join Date: Oct 2014
Posts: 10089

05 Apr 2024, 12:24

In these applications, what you are interested in is the treatment effect. Forget about the post and treat variables and simply define a single treatment indicator (=1 if individual \(i\) at time \(t\) is subject to the treatment, and zero otherwise). Then use xtreg, fe or xtdidregress to estimate the treatment effect. See https://www.stata.com/stata17/differ...ences-DID-DDD/ for some discussion.
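
Written as an equation, the suggestion above might look like the following. This is only a sketch: the symbols \(\alpha\), \(\gamma\), \(\delta\) and \(D_{it}\) are notation assumed here, and the choice of geography and year effects follows the question asked earlier in the thread rather than anything stated in this reply.

```latex
y_{it} = \alpha_{g(i)} + \gamma_t + \delta\, D_{it} + \varepsilon_{it},
\qquad
D_{it} =
\begin{cases}
1 & \text{if individual } i \text{ is subject to the treatment at time } t,\\
0 & \text{otherwise.}
\end{cases}
```

Here \(\alpha_{g(i)}\) stands for the effect of the geography that individual \(i\) belongs to, \(\gamma_t\) for the year effect, and \(\delta\) for the treatment effect of interest.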