
  • drdid and reghdfe generate different coefficients once I add time-varying covariates. Why?

    Hello. I am running a difference-in-differences regression with 2 periods. All treated units (identified by "idcomune") are treated at the same time, in period 2.

    The "after" variable is built this way:
    Code:
    gen after=0
    
    replace after=1 if period>=2
    These are the results without time-varying covariates, using reghdfe:

    Code:
    reghdfe vote_share ever_treated##after, absorb (idcomune period) cluster(idcomune)
    (MWFE estimator converged in 2 iterations)
    note: 1bn.ever_treated is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-09)
    note: 1bn.after is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-09)
    
    HDFE Linear regression                            Number of obs   =     14,618
    Absorbing 2 HDFE groups                           F(   1,   7308) =       5.78
    Statistics robust to heteroskedasticity           Prob > F        =     0.0162
                                                      R-squared       =     0.8938
                                                      Adj R-squared   =     0.7875
                                                      Within R-sq.    =     0.0005
    Number of clusters (idcomune) =      7,309        Root MSE        =     0.0313
    
                                     (Std. err. adjusted for 7,309 clusters in idcomune)
    ------------------------------------------------------------------------------------
                       |               Robust
            vote_share | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------------+----------------------------------------------------------------
        1.ever_treated |          0  (omitted)
               1.after |          0  (omitted)
                       |
    ever_treated#after |
                  1 1  |   .0157764   .0065619     2.40   0.016     .0029132    .0286395
                       |
                 _cons |   .2030246   .0000121  1.7e+04   0.000     .2030009    .2030484
    ------------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
        idcomune |      7309        7309           0    *|
          period |         2           1           1     |
    -----------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
    These are the results without time-varying covariates, using drdid:

    Code:
    drdid vote_share, ivar(idcomune) time(period) treatment(ever_treated) cluster(idcomune) reg
    
    Doubly robust difference-in-differences                 Number of obs = 14,618
    Outcome model  : regression adjustment
    Treatment model: none
                               (Std. err. adjusted for 7,309 clusters in idcomune)
    ------------------------------------------------------------------------------
                 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    ATET         |
    ever_treated |
       (1 vs 0)  |   .0157764    .006561     2.40   0.016     .0029171    .0286357
    ------------------------------------------------------------------------------
    Same coefficients.


    Now I add one unit-level time-varying covariate and one regional-level time-varying covariate, using reghdfe:

    Code:
    reghdfe vote_share ever_treated##after region_unempl_year city_pop_year, absorb (idcomune period) cluster(idcomune)
    (MWFE estimator converged in 2 iterations)
    note: 1bn.ever_treated is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-09)
    note: 1bn.after is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-09)
    
    HDFE Linear regression                            Number of obs   =     14,618
    Absorbing 2 HDFE groups                           F(   3,   7308) =      27.95
    Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                      R-squared       =     0.8949
                                                      Adj R-squared   =     0.7897
                                                      Within R-sq.    =     0.0114
    Number of clusters (idcomune) =      7,309        Root MSE        =     0.0311
    
                                          (Std. err. adjusted for 7,309 clusters in idcomune)
    -----------------------------------------------------------------------------------------
                            |               Robust
                 vote_share | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    ------------------------+----------------------------------------------------------------
             1.ever_treated |          0  (omitted)
                    1.after |          0  (omitted)
                            |
         ever_treated#after |
                       1 1  |    .018809   .0067149     2.80   0.005     .0056459    .0319722
                            |
    region_unempl_year      |   .0070316   .0008053     8.73   0.000     .0054529    .0086102
         city_pop_year      |   5.75e-06   3.68e-06     1.56   0.119    -1.47e-06     .000013
                      _cons |   .0852026   .0257956     3.30   0.001     .0346359    .1357693
    -----------------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
        idcomune |      7309        7309           0    *|
          period |         2           1           1     |
    -----------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
    and now using drdid:

    Code:
    drdid vote_share region_unempl_year  city_pop_year, ivar(idcomune) time(period) treatment(ever_treated) cluster(idcomune) reg
    
    Doubly robust difference-in-differences                 Number of obs = 14,618
    Outcome model  : regression adjustment
    Treatment model: none
                               (Std. err. adjusted for 7,309 clusters in idcomune)
    ------------------------------------------------------------------------------
                 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    ATET         |
    ever_treated |
       (1 vs 0)  |   .0119468   .0060062     1.99   0.047     .0001748    .0237188
    ------------------------------------------------------------------------------
    Different coefficients.

    Edit:

    Same thing happens using:

    Code:
    reg vote_share region_unempl_year  city_pop_year ever_treated##after i.idcomune i.period, cluster(idcomune)
    All three specifications have the same coefficient before adding time-varying covariates and different coefficients after adding them.

    Why does this happen? Thank you.
    Last edited by Alessandro Bafaro; 11 Nov 2023, 15:41.

  • #2
    The best resource is the Sant’Anna and Zhao (2020) paper.
    You will see that the drdid specification is not the same as the reghdfe specification you are comparing it with.
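
    One way to see the difference, offered as a sketch rather than a description of drdid's internals: with two periods, the fixed-effects regression is equivalent to regressing the change in the outcome on treatment and the changes in the covariates, with one common slope per covariate. The regression-adjustment estimator in Sant’Anna and Zhao (2020) instead fits the outcome change on covariate levels among untreated units only, then averages the prediction error over treated units. The code below assumes a balanced two-period panel and assumes that drdid uses base-period covariate values (worth checking in the help file). The generated variable names (d_vote, d_unempl, d_pop, unempl_base, pop_base, d_vote_hat, att_i) are hypothetical.

    ```stata
    * Changes between the two periods (missing in each unit's first period)
    bysort idcomune (period): gen double d_vote   = vote_share - vote_share[_n-1]
    bysort idcomune (period): gen double d_unempl = region_unempl_year - region_unempl_year[_n-1]
    bysort idcomune (period): gen double d_pop    = city_pop_year - city_pop_year[_n-1]

    * Base-period covariate levels (ASSUMPTION: this is what drdid conditions on)
    bysort idcomune (period): gen double unempl_base = region_unempl_year[1]
    bysort idcomune (period): gen double pop_base    = city_pop_year[1]

    * (a) Two-period fixed-effects logic: first differences with covariate changes
    regress d_vote i.ever_treated d_unempl d_pop if after == 1, vce(cluster idcomune)

    * (b) Regression-adjustment logic: outcome-change model fit on controls only,
    *     using base-period levels, then predicted for every unit in period 2
    regress d_vote unempl_base pop_base if after == 1 & ever_treated == 0
    predict double d_vote_hat if after == 1, xb

    * Treated units' actual change minus their predicted untreated change
    gen double att_i = d_vote - d_vote_hat if after == 1 & ever_treated == 1
    summarize att_i
    ```

    If these assumptions hold, the coefficient on 1.ever_treated in (a) should match the reghdfe interaction term, and the mean of att_i in (b) should be close to the drdid ATET. Without covariates both reduce to the same difference in mean changes, which is why the estimates coincide in that case.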



    • #3
      Thank you. I'll look it up.
