  • Parallel trends after -didregress- and -reghdfe-

    I am using -didregress- and -reghdfe- (ssc install) to run a DID regression in Stata 17. Sample data are given below (courtesy of the Data Hall YouTube channel).

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(patient after) float after2 byte treatment int bp byte(age sleep) float did
     1 -1 0 1 143 47 5 0
     1  0 1 1 153 49 5 0
     1  1 2 1 125 49 5 1
     2 -1 0 1 153 47 5 0
     2  0 1 1 146 44 7 0
     2  1 2 1 130 44 7 1
     3 -1 0 1 152 59 4 0
     3  0 1 1 150 52 4 0
     3  1 2 1 132 52 4 1
     4 -1 0 1 126 59 4 0
     4  0 1 1 148 50 5 0
     4  1 2 1 120 50 5 1
     5 -1 0 1 141 40 6 0
     5  0 1 1 153 53 5 0
     5  1 2 1 132 53 5 1
     6 -1 0 1 162 40 6 0
     6  0 1 1 153 44 6 0
     6  1 2 1 120 44 6 1
     7 -1 0 1 176 59 6 0
     7  0 1 1 158 51 6 0
     7  1 2 1 125 51 6 1
     8 -1 0 1 134 59 6 0
     8  0 1 1 149 56 6 0
     8  1 2 1 130 56 6 1
     9 -1 0 1 143 41 6 0
     9  0 1 1 173 49 7 0
    10  1 2 1 133 49 7 1
    10 -1 0 1 136 41 6 0
    11  0 1 1 165 48 4 0
    11  1 2 1 135 48 4 1
    11 -1 0 0 148 51 4 0
    12  0 1 0 158 59 8 0
    12  1 2 0 159 59 8 0
    12 -1 0 0 158 51 4 0
    13  0 1 0 151 47 5 0
    13  1 2 0 153 47 5 0
    13 -1 0 0 157 48 5 0
    14  0 1 0 155 59 4 0
    14  1 2 0 126 59 4 0
    14 -1 0 0 131 48 5 0
    15  0 1 0 153 40 6 0
    15  1 2 0 162 40 6 0
    15 -1 0 0 146 40 5 0
    16  0 1 0 158 59 6 0
    16  1 2 0 134 59 6 0
    16 -1 0 0 167 40 5 0
    17  0 1 0 158 41 6 0
    17  1 2 0 136 41 6 0
    17 -1 0 0 181 40 6 0
    18  0 1 0 163 51 4 0
    18  1 2 0 150 51 4 0
    18 -1 0 0 139 59 6 0
    19  0 1 0 154 48 5 0
    19  1 2 0 168 48 5 0
    19 -1 0 0 148 59 6 0
    20  0 1 0 178 40 5 0
    20  1 2 0 155 40 5 0
    20 -1 0 0 141 41 6 0
    21  0 1 0 170 59 8 0
    21  1 2 0 136 59 8 0
    21 -1 0 0 136 41 6 0
    22  0 1 0 159 59 6 0
    22  1 2 0 132 59 6 0
    22 -1 0 0 162 51 4 0
    23  0 1 0 164 47 4 0
    23  1 2 0 160 47 4 0
    end
    The -did- variable is the interaction between treatment and after (time). I am trying to replicate the -didregress- results with -reghdfe-. The following two commands give me the same result.

    Code:
    didregress (bp) (did), group(patient) time(after2)
    reghdfe bp did i.after2, absorb(i.patient)
    I am also trying to test the parallel trends assumption. With -didregress- it is simple.
    Code:
     estat ptrends
    Can I replicate this with reghdfe? I tried
    Code:
     reghdfe bp i.treatment##i.after2 if after<1
    How do I "match" this with what estat ptrends gives me?

    Edit: I went through the Statalist post here https://www.statalist.org/forums/for...r-trends-model but couldn't understand how to apply it to my model.
    Last edited by Parul Gupta; 24 Jul 2024, 11:43.

  • #2
    After some struggle to figure this out, I think the problem is that your treatment variable is incorrect.

    This works:



    Code:
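    * Ever-treated indicator: 1 for every observation of a patient who has did == 1 at some point
    egen treat = max(did), by(patient)
    * Post-period indicator (did switches on when after2 == 2)
    g post = after2==2
    
    ** DID MODEL
    reghdfe bp did i.after2, absorb(i.patient) cluster(patient)
    
    ** TRENDS MODEL
    * Linear time trend for the treated group, interacted with the pre/post indicator
    reghdfe bp  i.post#1.treat#c.after2,  absorb(patient did after2) vce(cluster patient)
    * Parallel-trends test: the pre-period trend of the treated group is zero
    test 0.post#1.treat#c.after2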
    I modified the trend model a bit from the link. Most of the variables are dummies or categorical, so you can absorb them.

    I had not realized that ptrends includes all the data, not just the pre period. I suppose the motivation was that including the additional data increases sample size and thus the power of the test.


    • #3
      Sorry, I do not understand what was wrong with the treatment variable. The results from didregress and reghdfe match using the same dataset. The -did- variable is generated as treatment*(after>0).

      Also, the p-value from estat ptrends is 0.62, but the reghdfe command you showed gives 0.54. Is it possible to get the same value from both? Basically, I want to understand exactly what regression estat ptrends is running, so that I can replicate it manually.
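
      For reference, a minimal sketch of how that construction can be checked, assuming the -dataex- data from #1 are in memory (did_check is a name made up for this illustration):

      ```stata
      * Rebuild the interaction: 1 only for treatment == 1 observations in the post period
      generate byte did_check = treatment * (after > 0)

      * Stops with an error if any observation disagrees with the posted -did-
      assert did_check == did
      ```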


      • #4
        Try this:

        Code:
        gen byte pretreat = (after2 < 2)
        egen byte ever_treated = max(did), by(patient)
        
        qui reghdfe bp i.after2 i.patient i.did ever_treated#pretreat#c.after2, vce(cluster patient)
        test 0.ever_treated#1.pretreat#c.after2
        Edit: cross-posted with #2 and #3.

        You might want to look at the Methods and formulas section of the didregress postestimation help in the manual.
        Last edited by Hemanshu Kumar; 25 Jul 2024, 10:15.


        • #5
          This is a somewhat unrelated point, but I wish the newer documentation stressed more than it does at present that you don't need to construct the interaction term. I know it's what we learn in class, so it's natural to think that way, but many newer commands implicitly run some variation of a TWFE specification, where only a treatment dummy is needed (in fact they must, given that the interaction-term approach is inapplicable with staggered adoption).


          • #6
            I'm not sure why treatment is wrong, but you have 0 values for treatment for some of the did units in some periods.

            The treatment variable does not enter the reghdfe regression, so that regression still provides the correct result.
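
            A quick way to see this, sketched under the assumption that the data from #1 are in memory (treat_varies is a hypothetical helper variable):

            ```stata
            * Sort within patient by treatment, then compare the smallest and largest value
            bysort patient (treatment): generate byte treat_varies = treatment[1] != treatment[_N]

            * Show the patients whose treatment status is not constant over time
            tabulate patient if treat_varies
            ```

            On the data as posted this should flag patient 11, whose first-period row has treatment = 0 while its other two rows have treatment = 1.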


            • #7
              My approach in #2 gives you exactly what ptrends is doing.


              • #8
                Hemanshu's approach gives you output that looks exactly like ptrends output. I just eliminated some of the baggage by absorbing the time fixed effects and did.


                • #9
                  George, when I run your code in #2, it replicates the coefficients estimated by didregress and estat ptrends, but not the standard errors.

                  "I just eliminated some of the baggage by absorbing the time fixed effects and did." (#8)
                  The issue appears to be that using absorb is not trivial; it alters the standard errors. I am not very familiar with reghdfe, so I am not sure why this happens:

                  Code:
                  . reghdfe bp did i.after2, absorb(patient) vce(cluster patient)
                  (MWFE estimator converged in 1 iterations)
                  
                  HDFE Linear regression                            Number of obs   =         66
                  Absorbing 1 HDFE group                            F(   3,     22) =      25.65
                  Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                                    R-squared       =     0.6103
                                                                    Adj R-squared   =     0.3668
                                                                    Within R-sq.    =     0.4338
                  Number of clusters (patient) =         23         Root MSE        =    11.7477
                  
                                                 (Std. err. adjusted for 23 clusters in patient)
                  ------------------------------------------------------------------------------
                               |               Robust
                            bp | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                  -------------+----------------------------------------------------------------
                           did |  -13.10055   5.863917    -2.23   0.036    -25.26157   -.9395265
                               |
                        after2 |
                            1  |   7.816292   3.727973     2.10   0.048     .0849487    15.54763
                            2  |  -4.650909   5.656307    -0.82   0.420    -16.38137    7.079555
                               |
                         _cons |    149.445   2.463895    60.65   0.000     144.3351    154.5548
                  ------------------------------------------------------------------------------
                  
                  Absorbed degrees of freedom:
                  -----------------------------------------------------+
                   Absorbed FE | Categories  - Redundant  = Num. Coefs |
                  -------------+---------------------------------------|
                       patient |        23          23           0    *|
                  -----------------------------------------------------+
                  * = FE nested within cluster; treated as redundant for DoF computation
                  compared with

                  Code:
                  . reghdfe bp did i.after2 i.patient, vce(cluster patient)
                  (MWFE estimator converged in 1 iterations)
                  warning: missing F statistic; dropped variables due to collinearity or too few clusters
                  
                  HDFE Linear regression                            Number of obs   =         66
                  Absorbing 1 HDFE group                            F(  25,     22) =          .
                  Statistics robust to heteroskedasticity           Prob > F        =          .
                                                                    R-squared       =     0.6103
                                                                    Adj R-squared   =     0.3668
                                                                    Within R-sq.    =     0.6103
                  Number of clusters (patient) =         23         Root MSE        =    11.7477
                  
                                                 (Std. err. adjusted for 23 clusters in patient)
                  ------------------------------------------------------------------------------
                               |               Robust
                            bp | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                  -------------+----------------------------------------------------------------
                           did |  -13.10055   7.300518    -1.79   0.086    -28.24089    2.039801
                               |
                        after2 |
                            1  |   7.816292   4.641289     1.68   0.106    -1.809153    17.44174
                            2  |  -4.650909   7.042046    -0.66   0.516    -19.25522      9.9534
                               |
                       patient |
                            2  |   2.666667   5.00e-14  5.3e+13   0.000     2.666667    2.666667
                            3  |   4.333333   4.96e-14  8.7e+13   0.000     4.333333    4.333333
                            4  |         -9   5.27e-14 -1.7e+14   0.000           -9          -9
                            5  |   1.666667   5.07e-14  3.3e+13   0.000     1.666667    1.666667
                            6  |   4.666667   4.99e-14  9.4e+13   0.000     4.666667    4.666667
                            7  |   12.66667   5.04e-14  2.5e+14   0.000     12.66667    12.66667
                            8  |  -2.666667   5.08e-14 -5.3e+13   0.000    -2.666667   -2.666667
                            9  |    10.4468   1.365595     7.65   0.000     7.614728    13.27887
                           10  |  -.2693269   1.118929    -0.24   0.812    -2.589843    2.051189
                           11  |          9   5.65e-14  1.6e+14   0.000            9           9
                           12  |   13.63315   2.433506     5.60   0.000     8.586369    18.67993
                           13  |   8.966485   2.433506     3.68   0.001     3.919702    14.01327
                           14  |  -7.366849   2.433506    -3.03   0.006    -12.41363   -2.320066
                           15  |   8.966485   2.433506     3.68   0.001     3.919702    14.01327
                           16  |   8.299818   2.433506     3.41   0.003     3.253035     13.3466
                           17  |   13.63315   2.433506     5.60   0.000     8.586369    18.67993
                           18  |   5.966485   2.433506     2.45   0.023     .9197019    11.01327
                           19  |   11.96648   2.433506     4.92   0.000     6.919702    17.01327
                           20  |   13.29982   2.433506     5.47   0.000     8.253035     18.3466
                           21  |   2.633151   2.433506     1.08   0.291    -2.413631    7.679934
                           22  |   6.299818   2.433506     2.59   0.017     1.253035     11.3466
                           23  |   16.77225   3.627821     4.62   0.000     9.248615    24.29589
                               |
                         _cons |   143.6451   3.080987    46.62   0.000     137.2555    150.0346
                  ------------------------------------------------------------------------------


                  • #10
                    Just for completeness, the didregress estimates and standard errors line up with the second version of the reghdfe command, without using absorb.

                    Code:
                    . didregress (bp) (did) , group(patient) time(after2)
                    
                    Treatment and time information
                    
                    Time variable: after2
                    Control:       did = 0
                    Treatment:     did = 1
                    -----------------------------------
                                 |   Control  Treatment
                    -------------+---------------------
                    Group        |
                         patient |        13         10
                    -------------+---------------------
                    Time         |
                         Minimum |         0          2
                         Maximum |         1          2
                    -----------------------------------
                    
                    Difference-in-differences regression                        Number of obs = 66
                    Data type: Repeated cross-sectional
                    
                                                   (Std. err. adjusted for 23 clusters in patient)
                    ------------------------------------------------------------------------------
                                 |               Robust
                              bp | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                    -------------+----------------------------------------------------------------
                    ATET         |
                             did |
                       (1 vs 0)  |  -13.10055   7.300518    -1.79   0.086    -28.24089    2.039801
                    ------------------------------------------------------------------------------
                    Note: ATET estimate adjusted for group effects and time effects.
                    Note: Treatment occurs at different times.
                    didregress is of course using areg internally, so we can replicate its results like so:

                    Code:
                    . areg bp i.after2 did , vce(cluster patient) absorb(patient)
                    
                    Linear regression, absorbing indicators            Number of obs     =      66
                    Absorbed variable: patient                         No. of categories =      23
                                                                       F(3, 22)          =   16.55
                                                                       Prob > F          =  0.0000
                                                                       R-squared         =  0.6103
                                                                       Adj R-squared     =  0.3668
                                                                       Root MSE          = 11.7477
                    
                                                   (Std. err. adjusted for 23 clusters in patient)
                    ------------------------------------------------------------------------------
                                 |               Robust
                              bp | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                    -------------+----------------------------------------------------------------
                          after2 |
                              1  |   7.816292   4.641289     1.68   0.106    -1.809153    17.44174
                              2  |  -4.650909   7.042046    -0.66   0.516    -19.25522      9.9534
                                 |
                             did |  -13.10055   7.300518    -1.79   0.086    -28.24089    2.039801
                           _cons |    149.445   3.067525    48.72   0.000     143.0833    155.8066
                    ------------------------------------------------------------------------------


                    • #11
                      When you absorb a fixed effect, the degrees of freedom for the SE calculation increase. Thus, you get a lower SE. It's an advantage.


                      • #12
                        Originally posted by George Ford
                        When you absorb a fixed effect, the degrees of freedom for the SE calculation increase. Thus, you get a lower SE. It's an advantage.
                        I'm not sure that's generally true. The results of

                        Code:
                        areg bp i.after2 did , vce(cluster patient) absorb(patient)
                        reg bp i.after2 did i.patient, vce(cluster patient)
                        are identical with respect to standard errors, which lines up with my understanding. However, the results of

                        Code:
                        reghdfe bp i.after2 did , vce(cluster patient) absorb(patient)
                        reghdfe bp i.after2 did i.patient, vce(cluster patient)
                        are not similar with respect to standard errors. I do not know why. This is also essentially the difference between your code in #2 and my version in #4, which produce different standard errors -- with the latter replicating those of the didregress command.
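
                        One way to put the four sets of standard errors side by side, as a sketch only (the stored-estimate names are arbitrary, and the data from #1 are assumed to be in memory):

                        ```stata
                        * Fit the four specifications discussed above and store each set of results
                        quietly areg bp i.after2 did, vce(cluster patient) absorb(patient)
                        estimates store m_areg
                        quietly regress bp i.after2 did i.patient, vce(cluster patient)
                        estimates store m_lsdv
                        quietly reghdfe bp i.after2 did, vce(cluster patient) absorb(patient)
                        estimates store m_hdfe_abs
                        * reghdfe without absorb(), patient dummies entered explicitly (as run in #9)
                        quietly reghdfe bp i.after2 did i.patient, vce(cluster patient)
                        estimates store m_hdfe_dum

                        * Coefficient and standard error of did from each model in one table
                        estimates table m_areg m_lsdv m_hdfe_abs m_hdfe_dum, keep(did) b(%9.4f) se(%9.4f)
                        ```

                        Going by the outputs posted in this thread, the first, second and fourth columns should show a did standard error of about 7.30 and the third about 5.86.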


                        • #13
                          reghdfe adjusts for the absorbed degrees of freedom, which is correct for the estimation procedure. The question of whether one set of standard errors is correct or wrong should not arise; it depends on the estimation procedure. Using indicators in linear regression (the least squares dummy variables, or LSDV, estimator) uses up more degrees of freedom than the within estimator. However, the within estimator must account for the demeaning of variables in an initial step. You can replicate the standard errors from regress/areg with reghdfe by specifying the option -dofadjust(none)-.

                          Code:
                          * Example generated by -dataex-. For more info, type help dataex
                          clear
                          input byte(patient after) float after2 byte treatment int bp byte(age sleep) float did
                           1 -1 0 1 143 47 5 0
                           1  0 1 1 153 49 5 0
                           1  1 2 1 125 49 5 1
                           2 -1 0 1 153 47 5 0
                           2  0 1 1 146 44 7 0
                           2  1 2 1 130 44 7 1
                           3 -1 0 1 152 59 4 0
                           3  0 1 1 150 52 4 0
                           3  1 2 1 132 52 4 1
                           4 -1 0 1 126 59 4 0
                           4  0 1 1 148 50 5 0
                           4  1 2 1 120 50 5 1
                           5 -1 0 1 141 40 6 0
                           5  0 1 1 153 53 5 0
                           5  1 2 1 132 53 5 1
                           6 -1 0 1 162 40 6 0
                           6  0 1 1 153 44 6 0
                           6  1 2 1 120 44 6 1
                           7 -1 0 1 176 59 6 0
                           7  0 1 1 158 51 6 0
                           7  1 2 1 125 51 6 1
                           8 -1 0 1 134 59 6 0
                           8  0 1 1 149 56 6 0
                           8  1 2 1 130 56 6 1
                           9 -1 0 1 143 41 6 0
                           9  0 1 1 173 49 7 0
                          10  1 2 1 133 49 7 1
                          10 -1 0 1 136 41 6 0
                          11  0 1 1 165 48 4 0
                          11  1 2 1 135 48 4 1
                          11 -1 0 0 148 51 4 0
                          12  0 1 0 158 59 8 0
                          12  1 2 0 159 59 8 0
                          12 -1 0 0 158 51 4 0
                          13  0 1 0 151 47 5 0
                          13  1 2 0 153 47 5 0
                          13 -1 0 0 157 48 5 0
                          14  0 1 0 155 59 4 0
                          14  1 2 0 126 59 4 0
                          14 -1 0 0 131 48 5 0
                          15  0 1 0 153 40 6 0
                          15  1 2 0 162 40 6 0
                          15 -1 0 0 146 40 5 0
                          16  0 1 0 158 59 6 0
                          16  1 2 0 134 59 6 0
                          16 -1 0 0 167 40 5 0
                          17  0 1 0 158 41 6 0
                          17  1 2 0 136 41 6 0
                          17 -1 0 0 181 40 6 0
                          18  0 1 0 163 51 4 0
                          18  1 2 0 150 51 4 0
                          18 -1 0 0 139 59 6 0
                          19  0 1 0 154 48 5 0
                          19  1 2 0 168 48 5 0
                          19 -1 0 0 148 59 6 0
                          20  0 1 0 178 40 5 0
                          20  1 2 0 155 40 5 0
                          20 -1 0 0 141 41 6 0
                          21  0 1 0 170 59 8 0
                          21  1 2 0 136 59 8 0
                          21 -1 0 0 136 41 6 0
                          22  0 1 0 159 59 6 0
                          22  1 2 0 132 59 6 0
                          22 -1 0 0 162 51 4 0
                          23  0 1 0 164 47 4 0
                          23  1 2 0 160 47 4 0
                          end
                          
                          areg bp i.after2 did , vce(cluster patient) absorb(patient)
                          reghdfe bp i.after2 did , vce(cluster patient) absorb(patient) dofadj(none)
                          Results:

                          Code:
                          . areg bp i.after2 did , vce(cluster patient) absorb(patient)
                          
                          Linear regression, absorbing indicators            Number of obs     =      66
                          Absorbed variable: patient                         No. of categories =      23
                                                                             F(3, 22)          =   16.55
                                                                             Prob > F          =  0.0000
                                                                             R-squared         =  0.6103
                                                                             Adj R-squared     =  0.3668
                                                                             Root MSE          = 11.7477
                          
                                                         (Std. err. adjusted for 23 clusters in patient)
                          ------------------------------------------------------------------------------
                                       |               Robust
                                    bp | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                          -------------+----------------------------------------------------------------
                                after2 |
                                    1  |   7.816292   4.641289     1.68   0.106    -1.809153    17.44174
                                    2  |  -4.650909   7.042046    -0.66   0.516    -19.25522      9.9534
                                       |
                                   did |  -13.10055   7.300518    -1.79   0.086    -28.24089    2.039801
                                 _cons |    149.445   3.067525    48.72   0.000     143.0833    155.8066
                          ------------------------------------------------------------------------------
                          
                          .
                          . reghdfe bp i.after2 did , vce(cluster patient) absorb(patient) dofadj(none)
                          (MWFE estimator converged in 1 iterations)
                          
                          HDFE Linear regression                            Number of obs   =         66
                          Absorbing 1 HDFE group                            F(   3,     22) =      16.55
                          Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                                            R-squared       =     0.6103
                                                                            Adj R-squared   =     0.3668
                                                                            Within R-sq.    =     0.4338
                          Number of clusters (patient) =         23         Root MSE        =    11.7477
                          
                                                         (Std. err. adjusted for 23 clusters in patient)
                          ------------------------------------------------------------------------------
                                       |               Robust
                                    bp | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                          -------------+----------------------------------------------------------------
                                after2 |
                                    1  |   7.816292   4.641289     1.68   0.106    -1.809153    17.44174
                                    2  |  -4.650909   7.042046    -0.66   0.516    -19.25522      9.9534
                                       |
                                   did |  -13.10055   7.300518    -1.79   0.086    -28.24089    2.039801
                                 _cons |    149.445   3.067525    48.72   0.000     143.0833    155.8066
                          ------------------------------------------------------------------------------
                          
                          Absorbed degrees of freedom:
                          -----------------------------------------------------+
                           Absorbed FE | Categories  - Redundant  = Num. Coefs |
                          -------------+---------------------------------------|
                               patient |        23           0          23     |
                          -----------------------------------------------------+
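
                          As a back-of-the-envelope check, the two did standard errors in this thread appear to differ only by the small-sample factor. This is a sketch that uses nothing but numbers from the outputs above; the parameter counts are inferred from those outputs, not taken from the reghdfe documentation.

                          ```stata
                          * N = 66 observations in both regressions
                          local N      = 66
                          * areg / LSDV: 2 time dummies + did + 23 patient effects (which include the constant)
                          local k_lsdv = 3 + 23
                          * default reghdfe: patient FE nested within the cluster variable are treated as
                          * redundant, leaving 2 time dummies + did + constant (assumed count)
                          local k_hdfe = 3 + 1

                          * Rescale the default reghdfe SE of did (5.863917) by sqrt((N - k_hdfe)/(N - k_lsdv))
                          display %9.6f 5.863917 * sqrt((`N' - `k_hdfe') / (`N' - `k_lsdv'))
                          ```

                          This should print a value of about 7.3005, in line with the standard error reported by areg and didregress.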
                          Last edited by Andrew Musau; 26 Jul 2024, 11:23.


                          • #14
                            Thanks, Hemanshu, George and Andrew!
