Estimate 2SLS with multi-level fixed-effects using ivregress vs reghdfe vs ivreghdfe

Cuong Hoang

Join Date: Jan 2018
Posts: 13

08 Sep 2021, 15:01

Hello Stata experts,

At the moment I am working on a project that requires 2SLS with fixed effects. I am struggling to make sense of the differences in the estimation results produced by three Stata commands (ivregress, reghdfe, and ivreghdfe) and to decide which one should be used.
More specifically, I estimate a production function with Y (a measure of firm performance) as the dependent variable. The regressors include production factors (x1 - x4) and the endogenous city factors cityX5 and cityX6 (say, log of city population and immigration rate; these are my variables of interest). I use a cross-sectional firm-level data set for one country.

To deal with the endogeneity of the city factors, I use the city-level variables z1 and z2 to instrument for cityX5, and z3 and z4 to instrument for cityX6. I add state_industry dummies to control for region-industry fixed effects (each state has many cities).

Results produced by ivregress (some unimportant output is omitted to save space):

HTML Code:

. ivregress 2sls Y x1 x2 x3 x4 (cityX5 cityX6= z1 z2 z3 z4) i.state_industry_pairs if sample==1, vce(cluster state_id)
note: 43.state_industry_pairs identifies no observations in the sample
note: 44.state_industry_pairs identifies no observations in the sample
note: 48.state_industry_pairs identifies no observations in the sample
note: 60.state_industry_pairs identifies no observations in the sample
.....
Instrumental variables (2SLS) regression          Number of obs   =    164,343
                                                  Wald chi2(1031) =  181743.21
                                                  Prob > chi2     =     0.0000
                                                  R-squared       =     0.3889
                                                  Root MSE        =     1.1872

                                      (Std. Err. adjusted for 63 clusters in state_id)
--------------------------------------------------------------------------------------
                     |               Robust
                   Y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
              cityX5 |   .0329823   .0152646     2.16   0.031     .0030642    .0629004
              cityX6 |   .1052269   .2336223     0.45   0.652    -.3526644    .5631183
                  x1 |   .2926192   .0102266    28.61   0.000     .2725754     .312663
                  x2 |   .0316666   .0350465     0.90   0.366    -.0370233    .1003565
                  x3 |   .0694911   .0182037     3.82   0.000     .0338126    .1051696
                  x4 |   .6947005   .0578016    12.02   0.000     .5814115    .8079895
                     |
state_industry_pairs |
                  2  |   1.686815   .0173291    97.34   0.000      1.65285    1.720779
                  3  |   1.727269   .0193602    89.22   0.000     1.689324    1.765214
                  4  |    .918509   .0260839    35.21   0.000     .8673854    .9696326
                  5  |   1.665119   .0484384    34.38   0.000     1.570181    1.760056
                  6  |   2.387751   .0417277    57.22   0.000     2.305967    2.469536
                  7  |   1.261545   .0535408    23.56   0.000     1.156607    1.366484
                  8  |   .7721015   .0317133    24.35   0.000     .7099446    .8342585
                  9  |   1.387479    .062664    22.14   0.000      1.26466    1.510298
....
                     |
               _cons |    2.09025   .2286741     9.14   0.000     1.642057    2.538443
--------------------------------------------------------------------------------------
Instrumented:  cityX5 cityX6
Instruments:   x1 x2 x3 x4 2.state_industry_pairs 3.state_industry_pairs
               4.state_industry_pairs 5.state_industry_pairs 6.state_industry_pairs
               7.state_industry_pairs 8.state_industry_pairs 9.state_industry_pairs
               10.state_industry_pairs 11.state_industry_pairs
               12.state_industry_pairs 13.state_industry_pairs
               14.state_industry_pairs 15.state_industry_pairs
               16.state_industry_pairs 17.state_industry_pairs
               18.state_industry_pairs 19.state_industry_pairs
               20.state_industry_pairs 21.state_industry_pairs
...

Results produced by ivreghdfe:

HTML Code:

. ivreghdfe Y x1 x2 x3 x4 (cityX5 cityX6= z1 z2 z3 z4) if sample==1, absorb(state_industry_pairs) vce(cluster state_id)
(dropped 55 singleton observations)
(MWFE estimator converged in 1 iterations)

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =   164288
                                                      F(  6,164281) =  4595.61
                                                      Prob > F      =   0.0000
Total (centered) SS     =  270763.1577                Centered R2   =   0.1445
Total (uncentered) SS   =  270763.1577                Uncentered R2 =   0.1445
Residual SS             =  231627.2165                Root MSE      =    1.187

------------------------------------------------------------------------------
           Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cityX5 |   .0329823   .0067151     4.91   0.000     .0198209    .0461438
      cityX6 |   .1052269    .072774     1.45   0.148    -.0374086    .2478625
          x1 |   .2926192   .0020022   146.15   0.000     .2886949    .2965435
          x2 |   .0316666   .0015753    20.10   0.000     .0285791    .0347541
          x3 |   .0694911   .0070562     9.85   0.000     .0556611    .0833211
          x4 |   .6947005    .013997    49.63   0.000     .6672667    .7221343
------------------------------------------------------------------------------
Underidentification test (Anderson canon. corr. LM statistic):         4.3e+04
                                                   Chi-sq(3) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):              1.4e+04
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    11.04
                                         10% maximal IV relative bias     7.56
                                         20% maximal IV relative bias     5.57
                                         30% maximal IV relative bias     4.73
                                         10% maximal IV size             16.87
                                         15% maximal IV size              9.93
                                         20% maximal IV size              7.54
                                         25% maximal IV size              6.28
Source: Stock-Yogo (2005).  Reproduced by permission.
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):           0.064
                                                   Chi-sq(2) P-val =    0.9687
------------------------------------------------------------------------------
Instrumented:         cityX5 cityX6
Included instruments: x1 x2 x3 x4
Excluded instruments: z1 z2 z3 z4
Partialled-out:       _cons
                      nb: total SS, model F and R2s are after partialling-out;
                          any small-sample adjustments include partialled-out
                          variables in regressor count K
------------------------------------------------------------------------------

Absorbed degrees of freedom:
--------------------------------------------------------------+
          Absorbed FE | Categories  - Redundant  = Num. Coefs |
----------------------+---------------------------------------|
 state_industry_pairs |       971         971           0    *|
--------------------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

Results produced by my manual calculation with reghdfe: I run the first-stage and second-stage regressions, then correct the standard errors of the second-stage coefficients with a bootstrap:

HTML Code:

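program tsls_test
    * First stage for cityX5: all instruments plus the exogenous regressors
    quietly reghdfe cityX5 z1 z2 z3 z4 x1 x2 x3 x4 if sample==1, absorb(state_industry_pairs) vce(cluster state_id)
    predict cityX5n, xb
    * First stage for cityX6
    quietly reghdfe cityX6 z1 z2 z3 z4 x1 x2 x3 x4 if sample==1, absorb(state_industry_pairs) vce(cluster state_id)
    predict cityX6n, xb
    * Second stage: fitted values replace the endogenous regressors
    quietly reghdfe y x1 x2 x3 x4 cityX5n cityX6n if sample==1, absorb(state_industry_pairs) vce(cluster state_id)
    * Drop the fitted values so the program can be called again in the next replication
    drop cityX5n cityX6n
end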

bootstrap, cluster(state_id) reps(250): tsls_test
(running tsls_test on estimation sample)

Bootstrap replications (250)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
..................................................   150
..................................................   200
..................................................   250

HDFE Linear regression                          Number of obs     =    164,288
Absorbing 1 HDFE group                          Replications      =        250
                                                Wald chi2(6)      =    4202.52
                                                Prob > chi2       =     0.0000
                                                R-squared         =     0.3821
                                                Adj R-squared     =     0.3784
                                                Root MSE          =     1.1916

                               (Replications based on 63 clusters in state_id)
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .2926192   .0104536    27.99   0.000     .2721305    .3131079
          x2 |   .0316666    .036746     0.86   0.389    -.0403543    .1036874
          x3 |   .0694911   .0176274     3.94   0.000      .034942    .1040402
          x4 |   .6947005   .0529963    13.11   0.000     .5908297    .7985713
     cityX5n |   .0329823    .015012     2.20   0.028     .0035593    .0624054
     cityX6n |    .105227   .3638874     0.29   0.772    -.6079792    .8184331
       _cons |   4.080161   .2575816    15.84   0.000     3.575311    4.585012
------------------------------------------------------------------------------

Absorbed degrees of freedom:
--------------------------------------------------------------+
          Absorbed FE | Categories  - Redundant  = Num. Coefs |
----------------------+---------------------------------------|
 state_industry_pairs |       971         971           0    *|
--------------------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

As you can see, the coefficients on cityX5 and cityX6 are the same across the three approaches, but the SEs differ.
The SEs produced by ivregress are fairly close to those from the manual calculation, although they differ noticeably for cityX6 (.3638874 with the manual calculation, but only .2336223 with ivregress). The SEs produced by ivreghdfe are much lower, which looks suspicious; it seems there is a problem with the cluster option.

My questions are:
1. Do you know why my manual calculation does not give exactly the same results as ivregress?
2. Do you know what is the problem with ivreghdfe in this case? If you share my suspicion of the option cluster, do you suggest any solution?

The point is that I would like to replace state_industry_pairs with dummies for state and industry at a finer level (more digits in the industry classification) to improve the accuracy of the estimation, so there will be more pairs to estimate. In that case ivregress cannot handle the regression because there are too many dummies (more than 3,000). The manual calculation can handle any type of dummies, but it tends to produce higher SEs, which can influence my final verdict on the role of cityX5 and cityX6; in addition, the bootstrap runs quite slowly. Finally, ivreghdfe is the quickest, automatically reports the important identification tests, and handles dummies very well, but it gives SEs that are too good to be true relative to what theory leads me to expect. Hence I really need to understand where the differences in SEs come from and how to deal with them, because I would expect the three approaches to give the same SEs.

I look forward to receiving answers or advice from you to make sense out of this problem. Thanks in advance.

Best regards,
Cuong

Last edited by Cuong Hoang; 08 Sep 2021, 15:05.

Joro Kolev

Join Date: Aug 2018

Posts: 3047
#2

08 Sep 2021, 18:26

This is an interesting exercise. I do not know why -ivreghdfe- gives such optimistic standard errors, but I notice that the ivreghdfe output contains no message at all announcing that cluster-robust standard errors are being computed. This might be just a formatting issue, or it might be a problem. Try running ivregress without the cluster-robust option to see whether you get similarly optimistic standard errors. In general the estimation results of ivregress and ivreghdfe should be more or less the same; since the standard errors are not, something fishy is going on.

Your number of clusters is just at the border of what is considered too few; I seem to remember seeing the magic number of 60 clusters cited in the literature as sufficient.

Finally your bootstrap, although commendable because you programmed it all by yourself, is not "state of the art."

The literature concluded (which of course might be just propaganda) that the best bootstrap method for what you are doing is the wild bootstrap.

The good news is that the wild bootstrap is pre-programmed by David Roodman, and it will be very easy for you to check what results you get by wild bootstrap. To that end check the user written -boottest-, which works after ivregress, and might even work after ivreghdfe.
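
As a rough sketch of that check, the lines below rerun the clustered ivregress model from post #1 and then call -boottest- on each city variable. The replication count and seed are arbitrary choices for illustration, and the sketch assumes -boottest- has been installed from SSC. Note that -boottest- reports bootstrap p-values and confidence sets rather than standard errors, so the comparison is on inference, not on the SE column.

```stata
* Assumption: boottest is installed (ssc install boottest)
ivregress 2sls Y x1 x2 x3 x4 (cityX5 cityX6 = z1 z2 z3 z4) i.state_industry_pairs if sample==1, vce(cluster state_id)

* Wild bootstrap test of H0: coefficient = 0, using the clustering from the model
* reps() and seed() values are illustrative, not recommendations
boottest cityX5, reps(999) seed(12345) nograph
boottest cityX6, reps(999) seed(12345) nograph
```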

Cuong Hoang

Join Date: Jan 2018
Posts: 13

09 Sep 2021, 03:01

Dear Joro,

Thanks for replying to my post. You made very good points! I reran -ivregress- without the cluster option and you were right: the SEs are almost the same as those from -ivreghdfe-, which confirms my suspicion that the option "vce(cluster id)" in -ivreghdfe- fails to do what it is supposed to do.
I then tried another expression, "cluster(id)", and fortunately the SEs became very close to those from -ivregress-, so this is the right way to cluster with -ivreghdfe-:

HTML Code:

ivreghdfe Y x1 x2 x3 x4 (cityX5 cityX6= z1 z2 z3 z4) if sample==1, absorb(state_industry_pairs) cluster(state_id)
(dropped 55 singleton observations)
(MWFE estimator converged in 1 iterations)

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on state_id

Number of clusters (state_id) =     63                Number of obs =   164281
                                                      F(  6,    62) =  1045.93
                                                      Prob > F      =   0.0000
Total (centered) SS     =  270757.4062                Centered R2   =   0.1446
Total (uncentered) SS   =  270757.4062                Uncentered R2 =   0.1446
Residual SS             =  231614.8081                Root MSE      =    1.187

------------------------------------------------------------------------------
             |               Robust
           Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cityX5 |   .0329321   .0153989     2.14   0.036     .0021501    .0637141
      cityX6 |    .104975   .2350229     0.45   0.657     -.364829     .574779
          x1 |   .2926391   .0103112    28.38   0.000     .2720272    .3132509
          x2 |   .0316801   .0353218     0.90   0.373    -.0389271    .1022872
          x3 |   .0695301   .0183434     3.79   0.000     .0328623     .106198
          x4 |   .6950936   .0583098    11.92   0.000      .578534    .8116532
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):             15.300
                                                   Chi-sq(3) P-val =    0.0016
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):              1.4e+04
                         (Kleibergen-Paap rk Wald F statistic):         35.482
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    11.04
                                         10% maximal IV relative bias     7.56
                                         20% maximal IV relative bias     5.57
                                         30% maximal IV relative bias     4.73
                                         10% maximal IV size             16.87
                                         15% maximal IV size              9.93
                                         20% maximal IV size              7.54
                                         25% maximal IV size              6.28
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         0.005
                                                   Chi-sq(2) P-val =    0.9974
------------------------------------------------------------------------------
Instrumented:         cityX5 cityX6
Included instruments: x1 x2 x3 x4
Excluded instruments: z1 z2 z3 z4
Partialled-out:       _cons
                      nb: total SS, model F and R2s are after partialling-out;
                          any small-sample adjustments include partialled-out
                          variables in regressor count K
------------------------------------------------------------------------------

Absorbed degrees of freedom:
--------------------------------------------------------------+
          Absorbed FE | Categories  - Redundant  = Num. Coefs |
----------------------+---------------------------------------|
 state_industry_pairs |       971         971           0    *|
--------------------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation
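
The comparison described above can be laid out side by side with stored estimates. This is only a sketch: the stored names (iv_cluster, iv_plain, hdfe_cluster) are made up for illustration, and the three ivregress/ivreghdfe calls are the ones already shown in this thread.

```stata
* ivregress with cluster-robust SEs
quietly ivregress 2sls Y x1 x2 x3 x4 (cityX5 cityX6 = z1 z2 z3 z4) i.state_industry_pairs if sample==1, vce(cluster state_id)
estimates store iv_cluster

* ivregress with conventional (unclustered) SEs
quietly ivregress 2sls Y x1 x2 x3 x4 (cityX5 cityX6 = z1 z2 z3 z4) i.state_industry_pairs if sample==1
estimates store iv_plain

* ivreghdfe with the cluster() option
quietly ivreghdfe Y x1 x2 x3 x4 (cityX5 cityX6 = z1 z2 z3 z4) if sample==1, absorb(state_industry_pairs) cluster(state_id)
estimates store hdfe_cluster

* Coefficients and SEs for the two variables of interest, one column per model
estimates table iv_cluster iv_plain hdfe_cluster, keep(cityX5 cityX6) b(%9.4f) se(%9.4f)
```

If clustering is being applied, the hdfe_cluster column should sit close to iv_cluster rather than iv_plain.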

As for the number of clusters, I tested various geographic units; the unit I used is the best for controlling autocorrelation since it produced the highest SEs, and it is good to know that the number of clusters is "sufficient".
Wild bootstrap sounds interesting; I will check -boottest- to see whether I can reproduce the same SEs as with the separate regressions.
Anyway, my main concern about the strange behavior of -ivreghdfe- has been resolved. I really appreciate your detailed discussion and information; it helps!

Best regards,
Cuong

Andrew Musau

Join Date: Oct 2014

Posts: 10084
#4

10 Sep 2021, 03:53

First, ivreghdfe is from SSC (FAQ Advice #12). Second, I have no clue what version of ivreghdfe you are running as the current version does not allow the option -vce()-. As Joro Kolev pointed out, your ivreghdfe regression did not compute robust standard errors, but reported conventional standard errors. You need the option -cluster()- after updating the command.

Code:

ssc install ivreghdfe, replace
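
To see which version is actually installed before and after updating, something like the following may help. It is a sketch that assumes all packages come from SSC; -ivreghdfe- relies on -reghdfe-, -ivreg2-, and -ftools-, so those are checked and refreshed as well.

```stata
* Show the file location and version header of each installed command
which ivreghdfe
which reghdfe
which ivreg2

* Refresh the packages ivreghdfe depends on, then ivreghdfe itself
ssc install ftools, replace
ssc install reghdfe, replace
ssc install ivreg2, replace
ssc install ivreghdfe, replace
```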
Cuong Hoang

Join Date: Jan 2018

Posts: 13
#5

12 Sep 2021, 13:29

Dear Andrew,

Thanks for your helpful information. You are right, I did have an old version of -ivreghdfe-. After running your code, if I continue using the option -vce(cluster ...)-, Stata reports that this option is invalid, hence -cluster()- is the only option.
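
For the finer state-by-industry fixed effects mentioned in post #1, the same call can be sketched with a new group identifier. The names industry4 and state_industry4 are hypothetical (a finer industry code and the pair identifier built from it), and the sketch assumes state_id identifies the state.

```stata
* Assumption: industry4 holds the finer industry classification (hypothetical name)
* Build one identifier per state x fine-industry pair
egen state_industry4 = group(state_id industry4)

* Absorb the finer pairs; cluster() is the option that requests cluster-robust SEs
ivreghdfe Y x1 x2 x3 x4 (cityX5 cityX6 = z1 z2 z3 z4) if sample==1, absorb(state_industry4) cluster(state_id)
```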

Best regards,
Cuong
Tilman Deissinger

Join Date: Jan 2024

Posts: 2
#6

15 Feb 2024, 05:26

I am using the same IV approach. However, I cannot add absorb(): when I do, I get the error "last estimates not found" r(301). The code runs when I leave absorb() out. How can this be?