
  • #16
    Originally posted by George Ford
    Hah. I confused your post with another one as far as the data, but this is still useful to you. (Ignore the ppml stuff).

    Limiting pre-treatment periods is common, and often falls in the range of 3-5 periods. The last 2 figures are "money" -- zero pre-treatment effects and then negative and significant treatment effects. You could get by with those results.

    But, if you look at the first figure, you see the entire effect series has a downward trend. Still, all the pretreatment CIs include 0. You can test them all using reghdfe (I could never get "estat leads" to work after eventdd; I do not know why).

    I'd look at the pre-treatment data carefully to see if something funky is going on when t2g is in the range -8 to -4. Do certain clubs fall out of or re-enter the sample during that time? That sort of thing.




    Hey George Ford,
    I figured out that my problem was that I didn't insert 0 for untreated teams in the t2t variable. I added the command: "replace t2t = 0 if TreatmentGroup == 0", then ran it again and got these new figures (respectively):

    [Figure: fig1.PNG]

    [Figure: fig2.PNG]

    [Figure: fig3.PNG]


    How does it look now?

    In addition, I ran this code as well, and got these results:
    Code:
    . g time_to_treat = Year - _nfd
    (552 missing values generated)
    
    . replace time_to_treat = 0 if missing(_nfd)
    (552 real changes made)
    
    . g treatt = !missing(_nfd)
    
    . eventdd OverallBalanceDeficit, timevar(time_to_treat) method(hdfe, absorb(team_id Year) cluster (team_id))
    (MWFE estimator converged in 5 iterations)
    
    HDFE Linear regression                            Number of obs   =      1,178
    Absorbing 2 HDFE groups                           F(  21,     76) =       2.55
    Statistics robust to heteroskedasticity           Prob > F        =     0.0016
                                                      R-squared       =     0.4236
                                                      Adj R-squared   =     0.3630
                                                      Within R-sq.    =     0.0277
    Number of clusters (team_id) =         77         Root MSE        =    29.9487
    
                                   (Std. err. adjusted for 77 clusters in team_id)
    ------------------------------------------------------------------------------
                 |               Robust
    OverallBal~t | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          lead11 |  -.4769937   13.51305    -0.04   0.972    -27.39057    26.43658
          lead10 |  -6.577794    13.5253    -0.49   0.628    -33.51576    20.36017
           lead9 |  -12.73715   18.76164    -0.68   0.499    -50.10419    24.62988
           lead8 |  -3.008794   11.56542    -0.26   0.795    -26.04332    20.02573
           lead7 |  -5.773172   9.715021    -0.59   0.554    -25.12231    13.57597
           lead6 |  -8.482715   10.97212    -0.77   0.442    -30.33558    13.37015
           lead5 |  -7.422983   9.017974    -0.82   0.413    -25.38384    10.53787
           lead4 |  -13.98219   8.643466    -1.62   0.110    -31.19714    3.232769
           lead3 |  -8.853627   8.223487    -1.08   0.285    -25.23212    7.524868
           lead2 |  -9.670522   7.500074    -1.29   0.201    -24.60821     5.26717
            lag0 |  -4.448779   7.645899    -0.58   0.562    -19.67691    10.77935
            lag1 |  -22.27183   7.321703    -3.04   0.003    -36.85426   -7.689393
            lag2 |  -16.29569   8.182697    -1.99   0.050    -32.59294    .0015674
            lag3 |  -17.22842   9.309728    -1.85   0.068    -35.77035    1.313513
            lag4 |  -17.84253   12.30589    -1.45   0.151    -42.35185    6.666782
            lag5 |   -15.0645   7.692217    -1.96   0.054    -30.38487    .2558819
            lag6 |  -23.63615    9.27095    -2.55   0.013    -42.10084   -5.171448
            lag7 |  -17.53306   12.15825    -1.44   0.153    -41.74831    6.682197
            lag8 |  -30.45896   11.39987    -2.67   0.009    -53.16377   -7.754145
            lag9 |  -30.07115   10.49391    -2.87   0.005    -50.97157   -9.170722
           lag10 |  -2.396095   27.59002    -0.09   0.931    -57.34637    52.55418
           _cons |   14.27427    6.95233     2.05   0.043     .4275061    28.12104
    ------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
         team_id |        77          77           0    *|
            Year |        16           1          15     |
    -----------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
    
    . summ time_to_treat
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
    time_to_tr~t |      1,178    .2928693    3.520779        -11         10
    
    . g shifted_ttt = time_to_treat - r(min)
    
    . summ shifted_ttt if time_to_treat == -1
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
     shifted_ttt |         42          10           0         10         10
    
    . local true_neg1 = r(mean)
    
    . reghdfe OverallBalanceDeficit ib`true_neg1'.shifted_ttt, a(team_id Year) vce(cluster team_id)
    (MWFE estimator converged in 5 iterations)
    
    HDFE Linear regression                            Number of obs   =      1,178
    Absorbing 2 HDFE groups                           F(  21,     76) =       2.55
    Statistics robust to heteroskedasticity           Prob > F        =     0.0016
                                                      R-squared       =     0.4236
                                                      Adj R-squared   =     0.3630
                                                      Within R-sq.    =     0.0277
    Number of clusters (team_id) =         77         Root MSE        =    29.9487
    
                                   (Std. err. adjusted for 77 clusters in team_id)
    ------------------------------------------------------------------------------
                 |               Robust
    OverallBal~t | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
     shifted_ttt |
              0  |  -.4769937   13.51305    -0.04   0.972    -27.39057    26.43658
              1  |  -6.577794    13.5253    -0.49   0.628    -33.51576    20.36017
              2  |  -12.73715   18.76164    -0.68   0.499    -50.10419    24.62988
              3  |  -3.008794   11.56542    -0.26   0.795    -26.04332    20.02573
              4  |  -5.773172   9.715021    -0.59   0.554    -25.12231    13.57597
              5  |  -8.482715   10.97212    -0.77   0.442    -30.33558    13.37015
              6  |  -7.422983   9.017974    -0.82   0.413    -25.38384    10.53787
              7  |  -13.98219   8.643466    -1.62   0.110    -31.19714    3.232769
              8  |  -8.853627   8.223487    -1.08   0.285    -25.23212    7.524868
              9  |  -9.670522   7.500074    -1.29   0.201    -24.60821     5.26717
             11  |  -4.448779   7.645899    -0.58   0.562    -19.67691    10.77935
             12  |  -22.27183   7.321703    -3.04   0.003    -36.85426   -7.689393
             13  |  -16.29569   8.182697    -1.99   0.050    -32.59294    .0015674
             14  |  -17.22842   9.309728    -1.85   0.068    -35.77035    1.313513
             15  |  -17.84253   12.30589    -1.45   0.151    -42.35185    6.666782
             16  |   -15.0645   7.692217    -1.96   0.054    -30.38487    .2558819
             17  |  -23.63615    9.27095    -2.55   0.013    -42.10084   -5.171448
             18  |  -17.53306   12.15825    -1.44   0.153    -41.74831    6.682197
             19  |  -30.45896   11.39987    -2.67   0.009    -53.16377   -7.754145
             20  |  -30.07115   10.49391    -2.87   0.005    -50.97157   -9.170722
             21  |  -2.396095   27.59002    -0.09   0.931    -57.34637    52.55418
                 |
           _cons |   14.27427    6.95233     2.05   0.043     .4275061    28.12104
    ------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
         team_id |        77          77           0    *|
            Year |        16           1          15     |
    -----------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
    
    . g coef = .
    (1,178 missing values generated)
    
    . g se = .
    (1,178 missing values generated)
    
    . levelsof shifted_ttt, l(times)
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
    
    . foreach t in `times' {
      2.         replace coef = _b[`t'.shifted_ttt] if shifted_ttt == `t'
      3.         replace se = _se[`t'.shifted_ttt] if shifted_ttt == `t'
      4. }
    (3 real changes made)
    (3 real changes made)
    (5 real changes made)
    (5 real changes made)
    (9 real changes made)
    (9 real changes made)
    (11 real changes made)
    (11 real changes made)
    (13 real changes made)
    (13 real changes made)
    (26 real changes made)
    (26 real changes made)
    (39 real changes made)
    (39 real changes made)
    (40 real changes made)
    (40 real changes made)
    (40 real changes made)
    (40 real changes made)
    (40 real changes made)
    (40 real changes made)
    (42 real changes made)
    (42 real changes made)
    (594 real changes made)
    (594 real changes made)
    (42 real changes made)
    (42 real changes made)
    (42 real changes made)
    (42 real changes made)
    (42 real changes made)
    (42 real changes made)
    (41 real changes made)
    (41 real changes made)
    (37 real changes made)
    (37 real changes made)
    (32 real changes made)
    (32 real changes made)
    (27 real changes made)
    (27 real changes made)
    (25 real changes made)
    (25 real changes made)
    (21 real changes made)
    (21 real changes made)
    (7 real changes made)
    (7 real changes made)
    
    . g ci_top = coef+1.96*se
    
    . g ci_bottom = coef - 1.96*se
    
    . keep time_to_treat coef se ci_*
    
    . duplicates drop
    
    Duplicates in terms of all variables
    
    (1,156 observations deleted)
    
    . sort time_to_treat
    
    . summ ci_top
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
          ci_top |         22    8.454178    13.99646  -9.503092   51.68034
    
    . local top_range = r(max)
    
    . summ ci_bottom
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
       ci_bottom |         22   -33.38446    12.76922  -56.47253          0
    
    . local bottom_range = r(min)
    
    . twoway (sc coef time_to_treat, connect(line)) (rcap ci_top ci_bottom time_to_treat)(function y = 0, range(time_to_treat)) (function y = 0, range(`bottom_range' `top_range') horiz), x
    > title("Time to Treatment") caption("95% Confidence Intervals Shown")
    [Figure: fig5.PNG]


    • #17
      Looks great, but sadly the t2treat variable should be missing for all control units. What you've done is turn all the control units into treated units. 0 has meaning in the t2t variable (it's the year of treatment).
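
To see why the zero matters, here is a minimal sketch on a made-up panel. The variable names first_treat, ttt, and ttt_bad are hypothetical and not from the thread's data. Two units are treated in 2004 and two are never treated; recoding the controls' missing event time to 0 places every control observation in the "year of treatment".

```stata
* Toy panel: 4 units observed 2001-2006 (all names and values are assumptions)
clear
set obs 4
generate id = _n
expand 6
bysort id: generate year = 2000 + _n

* Units 1-2 are first treated in 2004; units 3-4 are never treated
generate first_treat = 2004 if id <= 2
generate ttt = year - first_treat        // stays missing for never-treated units

* Correct coding: only treated units ever sit at event time 0
count if ttt == 0

* Mistaken coding: missing replaced by 0, so all control rows land at event time 0
generate ttt_bad = cond(missing(first_treat), 0, ttt)
count if ttt_bad == 0
```

The second count is much larger than the first because every control row is now counted at event time 0, which is the problem described above.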



      • #18
        George Ford
        Oh, right, I got confused for a moment. Thanks for your correction.

        So this is the new code I ran (only replacing 0 with . in the "replace time_to_treat" command). I think the final plot looks even better.
        Code:
        . g time_to_treat = Year - _nfd
        (552 missing values generated)
        
        . replace time_to_treat = . if missing(_nfd)
        (0 real changes made)
        
        . g treatt = !missing(_nfd)
        
        . eventdd OverallBalanceDeficit, timevar(time_to_treat) method(hdfe, absorb(team_id Year) cluster (team_id))
        (MWFE estimator converged in 5 iterations)
        
        HDFE Linear regression                            Number of obs   =      1,178
        Absorbing 2 HDFE groups                           F(  21,     76) =       2.55
        Statistics robust to heteroskedasticity           Prob > F        =     0.0016
                                                          R-squared       =     0.4236
                                                          Adj R-squared   =     0.3630
                                                          Within R-sq.    =     0.0277
        Number of clusters (team_id) =         77         Root MSE        =    29.9487
        
                                       (Std. err. adjusted for 77 clusters in team_id)
        ------------------------------------------------------------------------------
                     |               Robust
        OverallBal~t | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        -------------+----------------------------------------------------------------
              lead11 |  -.4769937   13.51305    -0.04   0.972    -27.39057    26.43658
              lead10 |  -6.577794    13.5253    -0.49   0.628    -33.51576    20.36017
               lead9 |  -12.73715   18.76164    -0.68   0.499    -50.10419    24.62988
               lead8 |  -3.008794   11.56542    -0.26   0.795    -26.04332    20.02573
               lead7 |  -5.773172   9.715021    -0.59   0.554    -25.12231    13.57597
               lead6 |  -8.482715   10.97212    -0.77   0.442    -30.33558    13.37015
               lead5 |  -7.422983   9.017974    -0.82   0.413    -25.38384    10.53787
               lead4 |  -13.98219   8.643466    -1.62   0.110    -31.19714    3.232769
               lead3 |  -8.853627   8.223487    -1.08   0.285    -25.23212    7.524868
               lead2 |  -9.670522   7.500074    -1.29   0.201    -24.60821     5.26717
                lag0 |  -4.448779   7.645899    -0.58   0.562    -19.67691    10.77935
                lag1 |  -22.27183   7.321703    -3.04   0.003    -36.85426   -7.689393
                lag2 |  -16.29569   8.182697    -1.99   0.050    -32.59294    .0015674
                lag3 |  -17.22842   9.309728    -1.85   0.068    -35.77035    1.313513
                lag4 |  -17.84253   12.30589    -1.45   0.151    -42.35185    6.666782
                lag5 |   -15.0645   7.692217    -1.96   0.054    -30.38487    .2558819
                lag6 |  -23.63615    9.27095    -2.55   0.013    -42.10084   -5.171448
                lag7 |  -17.53306   12.15825    -1.44   0.153    -41.74831    6.682197
                lag8 |  -30.45896   11.39987    -2.67   0.009    -53.16377   -7.754145
                lag9 |  -30.07115   10.49391    -2.87   0.005    -50.97157   -9.170722
               lag10 |  -2.396095   27.59002    -0.09   0.931    -57.34637    52.55418
               _cons |   12.18962   3.688874     3.30   0.001     4.842587    19.53664
        ------------------------------------------------------------------------------
        
        Absorbed degrees of freedom:
        -----------------------------------------------------+
         Absorbed FE | Categories  - Redundant  = Num. Coefs |
        -------------+---------------------------------------|
             team_id |        77          77           0    *|
                Year |        16           1          15     |
        -----------------------------------------------------+
        * = FE nested within cluster; treated as redundant for DoF computation
        
        . summ time_to_treat
        
            Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
        time_to_tr~t |        626    .5511182    4.816781        -11         10
        
        . g shifted_ttt = time_to_treat - r(min)
        (552 missing values generated)
        
        . summ shifted_ttt if time_to_treat == -1
        
            Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
         shifted_ttt |         42          10           0         10         10
        
        . local true_neg1 = r(mean)
        
        . reghdfe OverallBalanceDeficit ib`true_neg1'.shifted_ttt, a(team_id Year) vce(cluster team_id)
        (MWFE estimator converged in 5 iterations)
        note: 21.shifted_ttt omitted because of collinearity
        
        HDFE Linear regression                            Number of obs   =        626
        Absorbing 2 HDFE groups                           F(  20,     41) =       2.67
        Statistics robust to heteroskedasticity           Prob > F        =     0.0038
                                                          R-squared       =     0.4165
                                                          Adj R-squared   =     0.3358
                                                          Within R-sq.    =     0.0316
        Number of clusters (team_id) =         42         Root MSE        =    32.0265
        
                                       (Std. err. adjusted for 42 clusters in team_id)
        ------------------------------------------------------------------------------
                     |               Robust
        OverallBal~t | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        -------------+----------------------------------------------------------------
         shifted_ttt |
                  0  |   6.233777   35.84249     0.17   0.863    -66.15161    78.61916
                  1  |  -.5862694   32.28743    -0.02   0.986    -65.79205    64.61951
                  2  |  -8.183714   34.07579    -0.24   0.811    -77.00116    60.63374
                  3  |   1.072518   25.37419     0.04   0.966    -50.17169    52.31673
                  4  |  -2.384474   21.15712    -0.11   0.911    -45.11215     40.3432
                  5  |  -4.759514   20.81112    -0.23   0.820    -46.78842     37.2694
                  6  |  -3.867176    14.9387    -0.26   0.797    -34.03649    26.30214
                  7  |  -12.01325   12.64025    -0.95   0.347    -37.54074    13.51425
                  8  |  -7.688519   9.532608    -0.81   0.425    -26.94001    11.56297
                  9  |  -8.829516   8.552657    -1.03   0.308    -26.10196    8.442925
                 11  |  -4.712178   7.537397    -0.63   0.535    -19.93426     10.5099
                 12  |   -22.1243   8.975476    -2.46   0.018    -40.25064   -3.997953
                 13  |  -19.12849   10.08284    -1.90   0.065    -39.49121    1.234218
                 14  |  -22.05654    12.5053    -1.76   0.085    -47.31151    3.198427
                 15  |  -19.92374   15.87835    -1.25   0.217    -51.99071    12.14323
                 16  |  -16.18352   13.78461    -1.17   0.247     -44.0221    11.65506
                 17  |  -24.90229   20.78474    -1.20   0.238    -66.87793    17.07335
                 18  |  -20.13522   22.16379    -0.91   0.369    -64.89591    24.62547
                 19  |   -32.4268   23.70468    -1.37   0.179    -80.29938    15.44578
                 20  |  -28.91283   25.41484    -1.14   0.262    -80.23915    22.41349
                 21  |          0  (omitted)
                     |
               _cons |   18.51378   7.586495     2.44   0.019     3.192545    33.83502
        ------------------------------------------------------------------------------
        
        Absorbed degrees of freedom:
        -----------------------------------------------------+
         Absorbed FE | Categories  - Redundant  = Num. Coefs |
        -------------+---------------------------------------|
             team_id |        42          42           0    *|
                Year |        16           1          15     |
        -----------------------------------------------------+
        * = FE nested within cluster; treated as redundant for DoF computation
        
        . g coef = .
        (1,178 missing values generated)
        
        . g se = .
        (1,178 missing values generated)
        
        . levelsof shifted_ttt, l(times)
        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
        
        . foreach t in `times' {
          2.         replace coef = _b[`t'.shifted_ttt] if shifted_ttt == `t'
          3.         replace se = _se[`t'.shifted_ttt] if shifted_ttt == `t'
          4. }
        (3 real changes made)
        (3 real changes made)
        (5 real changes made)
        (5 real changes made)
        (9 real changes made)
        (9 real changes made)
        (11 real changes made)
        (11 real changes made)
        (13 real changes made)
        (13 real changes made)
        (26 real changes made)
        (26 real changes made)
        (39 real changes made)
        (39 real changes made)
        (40 real changes made)
        (40 real changes made)
        (40 real changes made)
        (40 real changes made)
        (40 real changes made)
        (40 real changes made)
        (42 real changes made)
        (42 real changes made)
        (42 real changes made)
        (42 real changes made)
        (42 real changes made)
        (42 real changes made)
        (42 real changes made)
        (42 real changes made)
        (42 real changes made)
        (42 real changes made)
        (41 real changes made)
        (41 real changes made)
        (37 real changes made)
        (37 real changes made)
        (32 real changes made)
        (32 real changes made)
        (27 real changes made)
        (27 real changes made)
        (25 real changes made)
        (25 real changes made)
        (21 real changes made)
        (21 real changes made)
        (7 real changes made)
        (7 real changes made)
        
        . g ci_top = coef+1.96*se
        (552 missing values generated)
        
        . g ci_bottom = coef - 1.96*se
        (552 missing values generated)
        
        . keep time_to_treat coef se ci_*
        
        . duplicates drop
        
        Duplicates in terms of all variables
        
        (1,155 observations deleted)
        
        . sort time_to_treat
        
        . summ ci_top
        
            Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
              ci_top |         22    22.06977    22.57042  -4.532363   76.48507
        
        . local top_range = r(max)
        
        . summ ci_bottom
        
            Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
           ci_bottom |         22    -44.9345    22.28383  -78.88798          0
        
        . local bottom_range = r(min)
        
        . twoway (sc coef time_to_treat, connect(line)) (rcap ci_top ci_bottom time_to_treat)(function y = 0, range(time_to_treat)) (function y = 0, range(`bottom_range' `top_range') horiz), x
        > title("Time to Treatment") caption("95% Confidence Intervals Shown")
        [Figure: fig1.PNG]


        What do you think about that?



        • #19
          This looks promising, no?

          time_to_treat is already missing for the control units, so you can leave out the replace (you can see it changed nothing).

          What's the graph look like from eventdd? The estimates show a lot of significance during the treatment period.
          You need to figure out what's up with lag10. Probably a sample size issue. It may need to be dropped.
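
One quick way to check the sample-size explanation is to count observations at each event time; in the log above, the last replace in the loop (the one for the largest event time) changed only 7 observations. A minimal sketch on a made-up panel, with hypothetical names:

```stata
* Toy panel: 3 units observed 2001-2005 (all names and values are assumptions)
clear
set obs 3
generate id = _n
expand 5
bysort id: generate year = 2000 + _n

* Unit 1 treated in 2002, unit 2 in 2004, unit 3 never treated
generate first_treat = cond(id == 1, 2002, cond(id == 2, 2004, .))
generate ttt = year - first_treat

* Observations per event time; thin cells at the extremes give noisy estimates
tabulate ttt

* Only the early-treated unit is ever observed 3 periods after treatment
count if ttt == 3
```

On real data, the same tabulate on the event-time variable shows which leads and lags rest on very few observations and may be candidates for dropping or binning.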

          I'm not sure what you're up to with the reghdfe stuff. I don't think it's giving you anything different than eventdd. What's your goal there? It's not the same estimates as eventdd.

          If you want to use reghdfe to mimic eventdd, use keepdummies in eventdd (you have to save the dta just before) and just use lead* lag* as regressors. That will let it provide exactly the same result.
          Last edited by George Ford; 11 Aug 2024, 15:10.



          • #20
            Originally posted by George Ford
            time_to_treat will be missing, so you can leave out the replace (you can see nothing changed).

            What's the graph look like from eventdd? The estimates show a lot of significance during the treatment period.
            You need to figure out what's up with lag10. Probably a sample size issue. It may need to be dropped.

            I'm not sure what you're up to with the reghdfe stuff. I don't think it's giving you anything different than eventdd. What's your goal there?

            This looks promising, no?
            Yes, it looks pretty much the same as eventdd. My main concern was the parallel trends assumption, which was violated when I initially used csdid. I wanted to ask: what's the difference between this and csdid? I'm still not sure whether, based on these results, I can infer that there is a causal effect, or whether I should do more to establish one.



            • #21
              You need to carefully study what csdid is up to and whether that's what you want. It includes some PS matching, etc....



              • #22
                How can I formally test the parallel trend assumption in eventdd or reghdfe?



                • #23
                  You are supposed to be able to do it using estat leads, but I can't get it to work (even after updating eventdd).

                  You can see that none of the lead terms is statistically significant, so PP holds.

                  To formally test, you could do this to make up for the estat problem (unless estat leads works for you).

                  Code:
                  save dataset, replace
                  eventdd OverallBalanceDeficit, timevar(time_to_treat) method(hdfe, absorb(team_id Year) cluster(team_id)) keepdummies
                  * try this, but for me it doesn't work
                  estat leads
                  * here's an alternative test

                  reghdfe OverallBalanceDeficit lead* lag* , absorb(team_id Year) cluster(team_id)
                  ** test 3 leads for PP; you can add as many as you want
                  testparm lead4 lead3 lead2  // you do not want to reject the null



                  • #24
                    Thank you George Ford.

                    The testparm command didn't reject the null.

                    Another question: when I looked up staggered DiD modelling, I saw that other threads sometimes use the interaction term (post x treated) in reghdfe. Why is that? Shouldn't I use it in my reghdfe command too?



                    • #25
                      post*treated is the DID term in TWFE regression (the coefficient is the DID estimate). That works with a 2x2 DID design (everybody treated at the same time).

                      This is not that. Here, you "pretend" everyone is being treated at the same time by centering the treatment date (period 0). Now, everything is + or - the treatment date (lead, lag). (You ignore calendar year effects).
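
As an illustration of the 2x2 case only, here is a minimal sketch on a made-up, noise-free panel in which the whole treated group is treated in 2003 and the true effect is set to 3. All names and numbers are assumptions for the toy example. With plain regress, the coefficient on the treated-by-post interaction is the DID estimate.

```stata
* Toy 2x2 DID: 4 units, 2001-2004, common treatment date (all values are assumptions)
clear
set obs 4
generate id = _n
expand 4
bysort id: generate year = 2000 + _n

generate treated = id <= 2               // units 1-2 form the treated group
generate post    = year >= 2003          // everyone is treated at the same time

* Outcome with parallel trends and a built-in treatment effect of 3, no noise
generate double y = id + 0.5*year + 3*treated*post

* The interaction coefficient is the DID estimate
regress y i.treated##i.post
display "DID estimate: " _b[1.treated#1.post]
```

Because the data are built with parallel trends and no noise, the interaction coefficient should recover the built-in effect of 3 up to rounding. The event-study setup in this thread differs in that treatment dates vary across clubs, which is why it uses leads and lags around each club's own treatment date.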

                      Based on your questions, I recommend you take this week and do nothing but study up on Diff-in-Diffs, staggered treatments, and event-study DID.

                      DID seems easy and there are lots of canned programs that will give you an estimate. But it's likewise a minefield of possible errors.

                      Angrist/Pischke books are a good start.

                      Scott Cunningham's substack account is excellent and probably a sensible place to start.

                      There are lots of YouTube videos.

                      Instats offers some excellent classes, and probably Udemy and Coursera do too.

                      It is vital that you understand, as best you can, what is going on underneath the command (e.g., why you don't replace missing t2treat values with zeros).

                      You should be able to construct a toy dataset that produces a set of known results using whatever estimator you plan to use, and a dataset that would give you a biased estimate using that estimator (different trends, etc...). That way you know you really get what's going on under the hood (a bit like what I did above with reghdfe after eventdd, since eventdd uses reghdfe). Toy models are an underutilized teaching method; Scott Cunningham's substack does a lot of it, and it's an excellent way to learn what's what for a specific estimator.

                      I think eventdd looks like a plausible way to proceed in your case. But you need to know how it works, what are its potential shortcomings (a few papers address this within the context of the modern staggered treatments literature), and so forth.



                      • #26
                        Originally posted by George Ford View Post
                        post*treated is the DID term in TWFE regression (the coefficient is the DID estimate). That works with a 2x2 DID design (everybody treated at the same time).

                        This is not that. Here, you "pretend" everyone is being treated at the same time by centering the treatment date (period 0). Now, everything is + or - the treatment date (lead, lag). (You ignore calendar year effects).

                        Based on your questions, I recommend you take this week and do nothing but study up on Diff-in-Diffs, staggered treatments, and event-study DID.

                        DID seems easy, and there are lots of canned programs that will give you an estimate. But it is also a minefield of possible errors.

                        Angrist/Pischke books are a good start.

                        Scott Cunningham's substack account is excellent and probably a sensible place to start.

                        There are lots of YouTube videos.

                        Instats offers some excellent classes, and Udemy and Coursera probably do as well.

                        It is vital that you understand, as best you can, what is going on underneath the command (e.g., why you don't replace missing t2treat variables with zeros).

                        You should be able to construct a toy dataset that produces a set of known results using whatever estimator you plan to use, and a dataset that would give you a biased estimate using that estimator (different trends, etc...). That way you know you really get what's going on under the hood (a bit like what I did above with reghdfe after eventdd, since eventdd uses reghdfe). Toy models are an underutilized teaching method; Scott Cunningham's substack does a lot of it, and it's an excellent way to learn what's what for a specific estimator.

                        I think eventdd looks like a plausible way to proceed in your case. But you need to know how it works, what its potential shortcomings are (a few papers address this within the context of the modern staggered treatments literature), and so forth.


                        Thank you so much for this help! You helped me immensely.

                        I will definitely read and learn more about it this week.

                        I would just like to draw your attention to this code that I found here: https://lost-stats.github.io/Model_E...ent_study.html

                        If you scroll down to the Stata section, you can see that they replace the "time to treat" variable with 0 wherever _nfd (the treatment year) is missing. I'm still not sure why, as it doesn't make sense to me, but I guess the correct way is the one you described.

                        Comment


                        • #27
                          eventdd requires missing values for the controls.

                          The post you provided uses an entirely different approach.

                          Comment


                          • #28
                            Originally posted by George Ford View Post
                            eventdd requires missing values for the controls.

                            The post you provided uses an entirely different approach.

                            From what I've seen, it estimates a staggered DiD with the reghdfe command. I'm not clear on why it is entirely different.

                            Comment


                            • #29
                              Upon inspection, you do get the same results with eventdd if you reset missings to 0. eventdd actually replaces the t2treat variable when it runs, which is interesting and unexpected, and automatically adjusts for . or 0 (assigning a value of -1). I'm not a fan of changing variables without notification and I may recode that portion.

                              It is much cleaner to use eventdd, keep the dummies, and then use reghdfe, since the effects are ordered in the results table. The other approach does not do that. Aside from that, the results are the same, but you have to hunt and peck around to match the coefficients to the exact lead/lag. There is no advantage, and there are some disadvantages, to the alternative approach.
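
                              A rough sketch of that workflow, not tested on your data. The names y, t2treat, id, and year are placeholders, and the window of 5 leads and 5 lags is arbitrary. I am assuming the keepdummies option leaves indicators named lead# and lag# in the dataset and that period -1 (lead1) is the omitted baseline; check the eventdd help file and run describe to confirm both for your installed version.

```stata
* Sketch only: y, t2treat, id, year are hypothetical variable names.
* t2treat is missing for never-treated controls.
eventdd y, timevar(t2treat) ///
    method(hdfe, absorb(id year) cluster(id)) ///
    leads(5) lags(5) accum keepdummies

* Confirm which lead/lag indicators were actually created (names assumed)
describe lead* lag*

* Re-run the same model directly, omitting lead1 (period -1) as the baseline.
* The coefficients appear in the order listed, so each lead/lag is easy to read.
reghdfe y lead5 lead4 lead3 lead2 lag0 lag1 lag2 lag3 lag4 lag5, ///
    absorb(id year) vce(cluster id)

* Joint test that all pre-treatment coefficients are zero
test lead5 lead4 lead3 lead2
```

                              The joint test on the leads is a stand-in for what estat leads is meant to report after eventdd.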



                              Comment


                              • #30
                                Originally posted by George Ford View Post
                                Upon inspection, you do get the same results with eventdd if you reset missings to 0. eventdd actually replaces the t2treat variable when it runs, which is interesting and unexpected, and automatically adjusts for . or 0 (assigning a value of -1). I'm not a fan of changing variables without notification and I may recode that portion.

                                It is much cleaner to use eventdd, keep the dummies, and then use reghdfe, since the effects are ordered in the results table. The other approach does not do that. Aside from that, the results are the same, but you have to hunt and peck around to match the coefficients to the exact lead/lag. There is no advantage, and there are some disadvantages, to the alternative approach.


                                Hi George Ford,

                                1) How can I figure out whether my treatment is endogenous? And if it is, how can I use an instrumental variable in the event-study code?
                                2) How can I obtain the ATT from the eventdd command (or reghdfe)?

                                Comment
