  • Estimate 2SLS with multi-level fixed-effects using ivregress vs reghdfe vs ivreghdfe

    Hello Stata experts,

    At the moment I'm working on a project that requires 2SLS with fixed effects included. I'm struggling to make sense of the differences in the estimation results produced by the Stata commands ivregress, reghdfe, and ivreghdfe, and to decide which one should be used.
    More specifically, I estimate a production function with Y (a measure of firm performance) as the dependent variable; the regressors include production factors (x1 - x4) and the (endogenous) city factors cityX5 and cityX6 (say, log of city population and immigration rate; these are my variables of interest). I use a cross-sectional firm-level data set for one country.

    To deal with the endogeneity of the city factors, I use city-level variables z1 and z2 to instrument for cityX5, and z3 and z4 to instrument for cityX6. I add state_industry dummies to control for region-industry fixed effects (each state has many cities).

    Results produced by ivregress (with some unimportant output dropped to save space):
    HTML Code:
    . ivregress 2sls Y x1 x2 x3 x4 (cityX5 cityX6= z1 z2 z3 z4) i.state_industry_pairs if sample==1, vce(cluster state_id)
    note: 43.state_industry_pairs identifies no observations in the sample
    note: 44.state_industry_pairs identifies no observations in the sample
    note: 48.state_industry_pairs identifies no observations in the sample
    note: 60.state_industry_pairs identifies no observations in the sample
    .....
    Instrumental variables (2SLS) regression          Number of obs   =    164,343
                                                      Wald chi2(1031) =  181743.21
                                                      Prob > chi2     =     0.0000
                                                      R-squared       =     0.3889
                                                      Root MSE        =     1.1872
    
                                          (Std. Err. adjusted for 63 clusters in state_id)
    --------------------------------------------------------------------------------------
                         |               Robust
                       Y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------------+----------------------------------------------------------------
                  cityX5 |   .0329823   .0152646     2.16   0.031     .0030642    .0629004
                  cityX6 |   .1052269   .2336223     0.45   0.652    -.3526644    .5631183
                      x1 |   .2926192   .0102266    28.61   0.000     .2725754     .312663
                      x2 |   .0316666   .0350465     0.90   0.366    -.0370233    .1003565
                      x3 |   .0694911   .0182037     3.82   0.000     .0338126    .1051696
                      x4 |   .6947005   .0578016    12.02   0.000     .5814115    .8079895
                         |
    state_industry_pairs |
                      2  |   1.686815   .0173291    97.34   0.000      1.65285    1.720779
                      3  |   1.727269   .0193602    89.22   0.000     1.689324    1.765214
                      4  |    .918509   .0260839    35.21   0.000     .8673854    .9696326
                      5  |   1.665119   .0484384    34.38   0.000     1.570181    1.760056
                      6  |   2.387751   .0417277    57.22   0.000     2.305967    2.469536
                      7  |   1.261545   .0535408    23.56   0.000     1.156607    1.366484
                      8  |   .7721015   .0317133    24.35   0.000     .7099446    .8342585
                      9  |   1.387479    .062664    22.14   0.000      1.26466    1.510298
    ....
                         |
                   _cons |    2.09025   .2286741     9.14   0.000     1.642057    2.538443
    --------------------------------------------------------------------------------------
    Instrumented:  cityX5 cityX6
    Instruments:   x1 x2 x3 x4 2.state_industry_pairs 3.state_industry_pairs
                   4.state_industry_pairs 5.state_industry_pairs 6.state_industry_pairs
                   7.state_industry_pairs 8.state_industry_pairs 9.state_industry_pairs
                   10.state_industry_pairs 11.state_industry_pairs
                   12.state_industry_pairs 13.state_industry_pairs
                   14.state_industry_pairs 15.state_industry_pairs
                   16.state_industry_pairs 17.state_industry_pairs
                   18.state_industry_pairs 19.state_industry_pairs
                   20.state_industry_pairs 21.state_industry_pairs
    ...
    Results produced by ivreghdfe:
    HTML Code:
    . ivreghdfe Y x1 x2 x3 x4 (cityX5 cityX6= z1 z2 z3 z4) if sample==1, absorb(state_industry_pairs) vce(cluster state_id)
    (dropped 55 singleton observations)
    (MWFE estimator converged in 1 iterations)
    
    IV (2SLS) estimation
    --------------------
    
    Estimates efficient for homoskedasticity only
    Statistics consistent for homoskedasticity only
    
                                                          Number of obs =   164288
                                                          F(  6,164281) =  4595.61
                                                          Prob > F      =   0.0000
    Total (centered) SS     =  270763.1577                Centered R2   =   0.1445
    Total (uncentered) SS   =  270763.1577                Uncentered R2 =   0.1445
    Residual SS             =  231627.2165                Root MSE      =    1.187
    
    ------------------------------------------------------------------------------
               Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          cityX5 |   .0329823   .0067151     4.91   0.000     .0198209    .0461438
          cityX6 |   .1052269    .072774     1.45   0.148    -.0374086    .2478625
              x1 |   .2926192   .0020022   146.15   0.000     .2886949    .2965435
              x2 |   .0316666   .0015753    20.10   0.000     .0285791    .0347541
              x3 |   .0694911   .0070562     9.85   0.000     .0556611    .0833211
              x4 |   .6947005    .013997    49.63   0.000     .6672667    .7221343
    ------------------------------------------------------------------------------
    Underidentification test (Anderson canon. corr. LM statistic):         4.3e+04
                                                       Chi-sq(3) P-val =    0.0000
    ------------------------------------------------------------------------------
    Weak identification test (Cragg-Donald Wald F statistic):              1.4e+04
    Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    11.04
                                             10% maximal IV relative bias     7.56
                                             20% maximal IV relative bias     5.57
                                             30% maximal IV relative bias     4.73
                                             10% maximal IV size             16.87
                                             15% maximal IV size              9.93
                                             20% maximal IV size              7.54
                                             25% maximal IV size              6.28
    Source: Stock-Yogo (2005).  Reproduced by permission.
    ------------------------------------------------------------------------------
    Sargan statistic (overidentification test of all instruments):           0.064
                                                       Chi-sq(2) P-val =    0.9687
    ------------------------------------------------------------------------------
    Instrumented:         cityX5 cityX6
    Included instruments: x1 x2 x3 x4
    Excluded instruments: z1 z2 z3 z4
    Partialled-out:       _cons
                          nb: total SS, model F and R2s are after partialling-out;
                              any small-sample adjustments include partialled-out
                              variables in regressor count K
    ------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    --------------------------------------------------------------+
              Absorbed FE | Categories  - Redundant  = Num. Coefs |
    ----------------------+---------------------------------------|
     state_industry_pairs |       971         971           0    *|
    --------------------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
    Results produced by my manual calculation with reghdfe: I run the first-stage and second-stage regressions, then correct the standard errors of the estimated coefficients in the second-stage regression with the bootstrap:
    HTML Code:
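    program tsls_test
        * First stage: regress each endogenous city variable on the instruments and exogenous regressors
        quietly: reghdfe cityX5 z1 z2 z3 z4 x1 x2 x3 x4  if sample==1, absorb(state_industry_pairs) vce(cluster state_id)
        predict cityX5n, xb
        quietly: reghdfe cityX6 z1 z2 z3 z4 x1 x2 x3 x4  if sample==1, absorb(state_industry_pairs) vce(cluster state_id)
        predict cityX6n, xb
        * Second stage: replace the endogenous variables with their first-stage fitted values
        quietly: reghdfe y x1 x2 x3 x4 cityX5n cityX6n if sample==1, absorb(state_industry_pairs) vce(cluster state_id)
        * Drop the fitted values so the program can run again in the next bootstrap replication
        drop cityX5n cityX6n
    end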
    
    bootstrap, cluster(state_id) reps(250): tsls_test
    (running tsls_test on estimation sample)
    
    Bootstrap replications (250)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    ..................................................    50
    ..................................................   100
    ..................................................   150
    ..................................................   200
    ..................................................   250
    
    HDFE Linear regression                          Number of obs     =    164,288
    Absorbing 1 HDFE group                          Replications      =        250
                                                    Wald chi2(6)      =    4202.52
                                                    Prob > chi2       =     0.0000
                                                    R-squared         =     0.3821
                                                    Adj R-squared     =     0.3784
                                                    Root MSE          =     1.1916
    
                                   (Replications based on 63 clusters in state_id)
    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Normal-based
               y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              x1 |   .2926192   .0104536    27.99   0.000     .2721305    .3131079
              x2 |   .0316666    .036746     0.86   0.389    -.0403543    .1036874
              x3 |   .0694911   .0176274     3.94   0.000      .034942    .1040402
              x4 |   .6947005   .0529963    13.11   0.000     .5908297    .7985713
         cityX5n |   .0329823    .015012     2.20   0.028     .0035593    .0624054
         cityX6n |    .105227   .3638874     0.29   0.772    -.6079792    .8184331
           _cons |   4.080161   .2575816    15.84   0.000     3.575311    4.585012
    ------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    --------------------------------------------------------------+
              Absorbed FE | Categories  - Redundant  = Num. Coefs |
    ----------------------+---------------------------------------|
     state_industry_pairs |       971         971           0    *|
    --------------------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
    As you can see, the coefficients of cityX5 and cityX6 are the same across the three commands, but the SEs are different.
    The SEs produced by ivregress are quite close to those from the manual calculation, except for cityX6 (the SE is .3638874 with the manual calculation, but only .2336223 with ivregress). The SEs produced by ivreghdfe are very low, which looks suspicious; it seems that there is a problem with the cluster option.

    My questions are:
    1. Do you know why my manual calculation does not give exactly the same results as ivregress?
    2. Do you know what the problem with ivreghdfe is in this case? If you share my suspicion about the cluster option, can you suggest a solution?

    The point is that I would like to replace state_industry_pairs with dummies for state and for industries at a finer level (more digits in the industry classification) to improve the accuracy of the estimation, so there will be more pairs to estimate. In that case, ivregress cannot handle the regression because there are too many dummies (more than 3,000). The manual calculation can handle any type of dummies, but it tends to produce higher SEs, which can influence my final verdict on the role of cityX5 and cityX6; in addition, the bootstrap command runs quite slowly. Finally, ivreghdfe is the quickest, automatically produces important identification tests, and handles dummies very well, but gives SEs that are too good (compared with what theory leads me to expect) to be true. Hence, I really need to understand where the differences in SEs come from and how to deal with them, because they should all be the same.

    I look forward to receiving answers or advice from you to make sense out of this problem. Thanks in advance.

    Best regards,
    Cuong
    Last edited by Cuong Hoang; 08 Sep 2021, 16:05.

  • #2
    This is an interesting exercise. I do not know why -ivreghdfe- gives such optimistic standard errors, and I notice that the ivreghdfe output contains no message at all announcing that cluster-robust standard errors are being computed. This might be just a formatting issue, or it might be a problem. Try running ivregress without the cluster-robust option to see whether you get similarly optimistic standard errors. In general, the estimation results of ivregress and ivreghdfe should be more or less the same; something fishy is going on, as the standard errors are not.

    Your number of clusters is just at the border of what is considered too few; I seem to remember seeing 60 clusters cited in the literature as the magic number that is sufficient.
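
    A minimal sketch of that check, reusing the variable names from #1. The stored-estimate names iv_conv and iv_clus are arbitrary labels chosen here for illustration, not part of the original post:

    ```stata
    * Conventional (non-clustered) standard errors
    ivregress 2sls Y x1 x2 x3 x4 (cityX5 cityX6 = z1 z2 z3 z4) i.state_industry_pairs if sample==1
    estimates store iv_conv

    * Cluster-robust standard errors, as in #1
    ivregress 2sls Y x1 x2 x3 x4 (cityX5 cityX6 = z1 z2 z3 z4) i.state_industry_pairs if sample==1, vce(cluster state_id)
    estimates store iv_clus

    * Side-by-side coefficients and standard errors for the two variables of interest
    estimates table iv_conv iv_clus, keep(cityX5 cityX6) b(%9.4f) se(%9.4f)
    ```

    If the iv_conv column reproduces the small standard errors reported by -ivreghdfe-, that suggests the cluster request was not applied there.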

    Finally your bootstrap, although commendable because you programmed it all by yourself, is not "state of the art."

    The literature concluded (which of course might be just propaganda) that the best bootstrap method for what you are doing is the wild bootstrap.

    The good news is that the wild bootstrap is pre-programmed by David Roodman, and it will be very easy for you to check what results you get by wild bootstrap. To that end check the user written -boottest-, which works after ivregress, and might even work after ivreghdfe.
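
    A hedged sketch of how this could look after -ivregress-, reusing the variable names from #1. The number of replications and the seed are arbitrary choices for illustration, and whether the same lines run after -ivreghdfe- is not assumed here:

    ```stata
    * boottest is user-written; install it once from SSC
    ssc install boottest, replace

    * Fit the IV model with cluster-robust standard errors first
    ivregress 2sls Y x1 x2 x3 x4 (cityX5 cityX6 = z1 z2 z3 z4) i.state_industry_pairs if sample==1, vce(cluster state_id)

    * Wild bootstrap tests of H0: coefficient = 0, clustering as in the estimation
    boottest cityX5, reps(999) seed(12345) nograph
    boottest cityX6, reps(999) seed(12345) nograph
    ```

    -boottest- reports a p-value and a confidence set for each hypothesis rather than a standard error, so the comparison with the other commands is on inference, not on the SE column.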



    • #3
      Dear Joro,

      Thanks for replying to my post. You made very good points! I reran -ivregress- without the cluster option, and you were right: the SEs are almost the same as those from -ivreghdfe-, which confirms my suspicion that the option "vce(cluster id)" of -ivreghdfe- fails to do what it is supposed to do.
      I then tried another expression, "cluster(id)", and fortunately the SEs became very close to those from -ivregress-, so this is the right way to deal with clustering in -ivreghdfe-:

      HTML Code:
      ivreghdfe Y x1 x2 x3 x4 (cityX5 cityX6= z1 z2 z3 z4) if sample==1, absorb(state_industry_pairs) cluster(state_id)
      (dropped 55 singleton observations)
      (MWFE estimator converged in 1 iterations)
      
      IV (2SLS) estimation
      --------------------
      
      Estimates efficient for homoskedasticity only
      Statistics robust to heteroskedasticity and clustering on state_id
      
      Number of clusters (state_id) =     63                Number of obs =   164281
                                                            F(  6,    62) =  1045.93
                                                            Prob > F      =   0.0000
      Total (centered) SS     =  270757.4062                Centered R2   =   0.1446
      Total (uncentered) SS   =  270757.4062                Uncentered R2 =   0.1446
      Residual SS             =  231614.8081                Root MSE      =    1.187
      
      ------------------------------------------------------------------------------
                   |               Robust
                 Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
            cityX5 |   .0329321   .0153989     2.14   0.036     .0021501    .0637141
            cityX6 |    .104975   .2350229     0.45   0.657     -.364829     .574779
                x1 |   .2926391   .0103112    28.38   0.000     .2720272    .3132509
                x2 |   .0316801   .0353218     0.90   0.373    -.0389271    .1022872
                x3 |   .0695301   .0183434     3.79   0.000     .0328623     .106198
                x4 |   .6950936   .0583098    11.92   0.000      .578534    .8116532
      ------------------------------------------------------------------------------
      Underidentification test (Kleibergen-Paap rk LM statistic):             15.300
                                                         Chi-sq(3) P-val =    0.0016
      ------------------------------------------------------------------------------
      Weak identification test (Cragg-Donald Wald F statistic):              1.4e+04
                               (Kleibergen-Paap rk Wald F statistic):         35.482
      Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    11.04
                                               10% maximal IV relative bias     7.56
                                               20% maximal IV relative bias     5.57
                                               30% maximal IV relative bias     4.73
                                               10% maximal IV size             16.87
                                               15% maximal IV size              9.93
                                               20% maximal IV size              7.54
                                               25% maximal IV size              6.28
      Source: Stock-Yogo (2005).  Reproduced by permission.
      NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
      ------------------------------------------------------------------------------
      Hansen J statistic (overidentification test of all instruments):         0.005
                                                         Chi-sq(2) P-val =    0.9974
      ------------------------------------------------------------------------------
      Instrumented:         cityX5 cityX6
      Included instruments: x1 x2 x3 x4
      Excluded instruments: z1 z2 z3 z4
      Partialled-out:       _cons
                            nb: total SS, model F and R2s are after partialling-out;
                                any small-sample adjustments include partialled-out
                                variables in regressor count K
      ------------------------------------------------------------------------------
      
      Absorbed degrees of freedom:
      --------------------------------------------------------------+
                Absorbed FE | Categories  - Redundant  = Num. Coefs |
      ----------------------+---------------------------------------|
       state_industry_pairs |       971         971           0    *|
      --------------------------------------------------------------+
      * = FE nested within cluster; treated as redundant for DoF computation
      As for the number of clusters, I tested various geographic units; the unit I used is the best for controlling autocorrelation since it produced the highest SEs, and it's good to know that the number of clusters is "sufficient".
      Wild bootstrap sounds interesting; I will check -boottest- to see if I can produce the same SEs with separate regressions.
      Anyway, my main concern about the strange behavior of -ivreghdfe- has been resolved. I really appreciate your detailed discussion and information; it helps!

      Best regards,
      Cuong



      • #4
        First, ivreghdfe is from SSC (FAQ Advice #12). Second, I have no clue what version of ivreghdfe you are running as the current version does not allow the option -vce()-. As Joro Kolev pointed out, your ivreghdfe regression did not compute robust standard errors, but reported conventional standard errors. You need the option -cluster()- after updating the command.

        Code:
        ssc install ivreghdfe, replace
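
        A small sketch for checking which version is installed and for refreshing the packages that -ivreghdfe- builds on. That ftools, reghdfe and ivreg2 are the relevant dependencies, and that the SSC copies are the ones wanted, are assumptions here; check the -ivreghdfe- help file for the current requirements:

        ```stata
        * Show the file and version of ivreghdfe that Stata currently finds
        which ivreghdfe

        * Refresh the packages ivreghdfe relies on (assumed dependencies), then ivreghdfe itself
        ssc install ftools, replace
        ssc install reghdfe, replace
        ssc install ivreg2, replace
        ssc install ivreghdfe, replace
        ```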



        • #5
          Dear Andrew,

          Thanks for your helpful information. You are right, I did have an old version of -ivreghdfe-. After running your code, if I continue using the option -vce(cluster ...)-, Stata warns that this option is invalid; hence -cluster()- is the only option.

          Best regards,
          Cuong



          • #6
            I am using the same IV approach. However, I cannot add absorb(): when I do, I get the error "last estimates not found", r(301). The code runs when I leave absorb() out. How can this be?
