Question about Underidentification Test and Weak Instrument Test in reghdfe

Mingyu Qi

Join Date: May 2020

Posts: 32
#1

24 Oct 2024, 10:19

Hi everyone,

I am using reghdfe from SSC in Stata 18 to run an IV regression on a data set with around 10M records. I included about 800 fixed effects and used two-way clustering for the standard errors. Below is the output for my first-stage regression. It looks like I fail to reject the H_0 that the equation is underidentified, yet I can reject the H_0 of weak identification. These results seem inconsistent, and I hope someone can help me interpret them.

Thank you in advance for your help!

Summary results for first-stage regressions
-------------------------------------------

             |                   |      (Underid)       | (Weak id)
Variable     | F(1, 43)   P-val  | SW Chi-sq(1)  P-val  | SW F(1, 43)
1.endg       |    65.52  0.0000  |        67.04  0.0000 |       65.52

NB: first-stage test statistics cluster-robust

Stock-Yogo weak ID F test critical values for single endogenous regressor:
10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for i.i.d. errors only.

Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic Chi-sq(1)=1.44 P-val=0.2302

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic 1322.26
Kleibergen-Paap Wald rk F statistic 65.52

Stock-Yogo weak ID test critical values for K1=1 and L1=1:
10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test F(1,43)= 11.21 P-val=0.0017
Anderson-Rubin Wald test Chi-sq(1)= 11.47 P-val=0.0007
Stock-Wright LM S statistic Chi-sq(1)= . P-val= .

NB: Underidentification, weak identification and weak-identification-robust
test statistics cluster-robust

Number of clusters (1) N_clust1 = 44
Number of clusters (2) N_clust2 = 418
Number of observations N = 11001968
Number of regressors K = 110
Number of endogenous regressors K1 = 1
Number of instruments L = 110
Number of excluded instruments L1 = 1
Mingyu Qi

Join Date: May 2020

Posts: 32
#2

27 Oct 2024, 14:48

To clarify: in my case, which statistics should I use to determine whether the first stage is valid and strong enough? Thank you!
George Ford

Join Date: Aug 2014

Posts: 3152
#3

28 Oct 2024, 08:49

I think the underidentification test is sensitive to the large number of fixed effects and the two-way clustering, and probably to the huge sample as well.

Just for kicks, what happens if you cluster the SEs one-way, or drop an FE?
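
To make the idea of sensitivity to clustering concrete, here is a stylized sketch. It is an illustration under stated assumptions, not the Kleibergen-Paap rk LM statistic that the software computes: it assumes a single instrument and a score-type statistic built from hypothetical cluster-level sums, (sum of scores)^2 / (sum of squared scores). By the Cauchy-Schwarz inequality this quantity can never exceed the number of clusters, and it shrinks toward 1 when one cluster contributes most of the total. The function name and the score values are made up for the example.

```python
def cluster_score_lm(cluster_scores):
    """Stylized LM-type statistic from cluster-level score sums.

    Computes (sum of scores)^2 / (sum of squared scores).
    This is a simplification for illustration only, not the
    Kleibergen-Paap rk LM statistic.
    """
    total = sum(cluster_scores)
    return total * total / sum(s * s for s in cluster_scores)


G = 44  # number of clusters reported in the thread

# Hypothetical case 1: every cluster contributes equally.
balanced = [1.0] * G

# Hypothetical case 2: one cluster contributes far more than the others.
dominated = [100.0] + [1.0] * (G - 1)

print("balanced :", cluster_score_lm(balanced))
print("dominated:", cluster_score_lm(dominated))
```

In the balanced case the statistic reaches its upper bound of 44; in the dominated case it is far smaller even though every cluster's score has the same sign. Whether anything like this is at work in the posted results cannot be told from the output alone, so checking how the first-stage relationship varies across the 44 clusters would be the natural follow-up.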
Mingyu Qi

Join Date: May 2020

Posts: 32
#4

29 Oct 2024, 08:19

Hi George, thank you for your suggestions. I tried one-way clustering and still got similar results (please see below).

Summary results for first-stage regressions
-------------------------------------------

             |                   |      (Underid)       | (Weak id)
Variable     | F(1, 43)   P-val  | SW Chi-sq(1)  P-val  | SW F(1, 43)
1.endg       |   100.94  0.0000  |       103.32  0.0000 |      100.94

NB: first-stage test statistics cluster-robust

Stock-Yogo weak ID F test critical values for single endogenous regressor:
10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for i.i.d. errors only.

Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic Chi-sq(1)=2.06 P-val=0.1511

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic 1321.87
Kleibergen-Paap Wald rk F statistic 100.94

Stock-Yogo weak ID test critical values for K1=1 and L1=1:
10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test F(1,43)= 28.88 P-val=0.0000
Anderson-Rubin Wald test Chi-sq(1)= 29.56 P-val=0.0000
Stock-Wright LM S statistic Chi-sq(1)= . P-val= .

NB: Underidentification, weak identification and weak-identification-robust
test statistics cluster-robust

Number of clusters N_clust = 44
Number of observations N = 11001968
Number of regressors K = 110
Number of endogenous regressors K1 = 1
Number of instruments L = 110
Number of excluded instruments L1 = 1
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#5

30 Oct 2024, 15:36

Maybe somebody who is more familiar with -reghdfe- than I am can give more specific advice.

But generally speaking, even different tests of the same null hypothesis are not guaranteed to give the same result. We might hope that they do, but there is no guarantee.

In your case the underidentification and weak identification tests are not even testing the same null hypothesis. An underidentification test is basically a test of whether the matrix of reduced-form coefficients on the excluded instruments has full rank, that is, rank equal to the number of endogenous variables (in your output, H_0 is rank K1-1 against H_a of rank K1). The weak identification test asks whether the correlation between the instruments and the endogenous variables is "sufficient".

So we might hope that these two approaches point in the same direction, but there is no guarantee that they do.
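
As a small numerical companion to this point, the sketch below takes the statistics reported in posts #1 and #4 and shows the two separate questions being asked: the underidentification LM statistic is referred to a chi-square(1) distribution, while the Kleibergen-Paap Wald F is compared with a Stock-Yogo critical value. The helper name `chi2_1_pvalue` is made up for this example, and the comparison with 16.38 is only indicative, since the output itself notes that those critical values are for i.i.d. errors.

```python
import math


def chi2_1_pvalue(stat: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# Values copied from the output in posts #1 and #4.
runs = {
    "two-way clustering": {"kp_lm": 1.44, "kp_wald_f": 65.52},
    "one-way clustering": {"kp_lm": 2.06, "kp_wald_f": 100.94},
}

# 10% maximal IV size for K1=1, L1=1 (derived for i.i.d. errors only).
STOCK_YOGO_10PCT = 16.38

for label, r in runs.items():
    p = chi2_1_pvalue(r["kp_lm"])
    exceeds = r["kp_wald_f"] > STOCK_YOGO_10PCT
    print(f"{label}: underid LM p-value = {p:.4f}; "
          f"KP Wald F = {r['kp_wald_f']} exceeds {STOCK_YOGO_10PCT}: {exceeds}")
```

The p-values match the reported 0.2302 and 0.1511 up to rounding of the printed statistics, which confirms that the two lines of output are simply two different statistics judged against two different yardsticks.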
Mingyu Qi

Join Date: May 2020

Posts: 32
#6

05 Nov 2024, 20:00

Hi Joro, thank you for your insights on this!