2SLS/ Interpreting Cragg-Donald Wald F statistic and Stock-Yogo weak ID test critical values

Deepika Deshpande

Join Date: Dec 2020
Posts: 107

15 Nov 2021, 20:14

Hello, I am running a robustness check using 2SLS and am having trouble interpreting the Cragg-Donald Wald F statistic and the Stock-Yogo weak ID test critical values together. In summary:
1. The Cragg-Donald Wald F statistic is > 10 - however,
2. One of the Stock-Yogo weak ID test critical values is greater than the Cragg-Donald Wald F statistic

I am not sure how to interpret this result: are the IVs that I selected weak or adequate? I am pasting my code and output below. Any clarification would be greatly appreciated. Thank you.

Code:

. xtivreg2 Ln_EBIT_ROA Ln_Revenue Ln_LTD_to_Sales Ln_Intangible_Assets CoAge wGDPpc wCPI wDCF wExp
> gr wGDPgr wCons No_of_Regions Ln_PS_RD (l1.Ln_GSD =  Ln_Int_exp  Ln_FSTS_by_Indgrp_Yr) if CoAge>
> =0 & NATION=="UNITED STATES" & NATIONCODE==840 & FSTS>=10 & FSTS <=100 & GENERALINDUSTRYCLASSIFI
> CATION ==1 & Year_<2020 & Year_<YearInactive & Discr_GS_Rev!=1, fe endog (l1.Ln_GSD)
Warning - singleton groups detected.  36 observation(s) not used.

FIXED EFFECTS ESTIMATION
------------------------
Number of groups =       148                    Obs per group: min =         2
                                                               avg =       5.8
                                                               max =        17

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =      861
                                                      F( 13,   700) =     4.91
                                                      Prob > F      =   0.0000
Total (centered) SS     =  240.0292164                Centered R2   =  -0.0099
Total (uncentered) SS   =  240.0292164                Uncentered R2 =  -0.0099
Residual SS             =  242.4170823                Root MSE      =    .5831

--------------------------------------------------------------------------------------
         Ln_EBIT_ROA |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
              Ln_GSD |
                 L1. |  -1.080281   .4164248    -2.59   0.009    -1.896459   -.2641034
                     |
          Ln_Revenue |   .5031523    .136515     3.69   0.000     .2355878    .7707168
     Ln_LTD_to_Sales |  -.1659439    .034025    -4.88   0.000    -.2326317   -.0992562
Ln_Intangible_Assets |     -.0631   .0425919    -1.48   0.138    -.1465785    .0203786
               CoAge |  -.0303457   .0139135    -2.18   0.029    -.0576157   -.0030758
              wGDPpc |   .0000707   .0000312     2.27   0.023     9.60e-06    .0001318
                wCPI |  -.0052788   .0276587    -0.19   0.849    -.0594889    .0489313
                wDCF |   2.93e-14   1.54e-13     0.19   0.849    -2.72e-13    3.31e-13
              wExpgr |    .009156   .0113405     0.81   0.419     -.013071     .031383
              wGDPgr |  -.0151599   .0337094    -0.45   0.653    -.0812292    .0509094
               wCons |   2.55e-14   5.83e-14     0.44   0.662    -8.88e-14    1.40e-13
       No_of_Regions |   .0875564   .0813537     1.08   0.282    -.0718939    .2470068
            Ln_PS_RD |   -.054201   .0683068    -0.79   0.427    -.1880798    .0796779
--------------------------------------------------------------------------------------
Underidentification test (Anderson canon. corr. LM statistic):          34.438
                                                   Chi-sq(2) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               17.738
Stock-Yogo weak ID test critical values: 10% maximal IV size             19.93
                                         15% maximal IV size             11.59
                                         20% maximal IV size              8.75
                                         25% maximal IV size              7.25
Source: Stock-Yogo (2005).  Reproduced by permission.
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):           0.765
                                                   Chi-sq(1) P-val =    0.3818
-endog- option:
Endogeneity test of endogenous regressors:                               4.278
                                                   Chi-sq(1) P-val =    0.0386
Regressors tested:    L.Ln_GSD
------------------------------------------------------------------------------
Instrumented:         L.Ln_GSD
Included instruments: Ln_Revenue Ln_LTD_to_Sales Ln_Intangible_Assets CoAge
                      wGDPpc wCPI wDCF wExpgr wGDPgr wCons No_of_Regions
                      Ln_PS_RD
Excluded instruments: Ln_Int_exp Ln_FSTS_by_Indgrp_Yr
------------------------------------------------------------------------------


Fei Wang

Join Date: Oct 2021

Posts: 726
#2

15 Nov 2021, 20:38

Stock-Yogo reports several critical values, each corresponding to a different tolerance for size distortion in IV inference. "Maximal IV size" is the worst-case rejection rate of a nominal 5% Wald test: the 10% value keeps that distortion small, while the 25% value allows it to be large. Given that your statistic (17.738) lies between the 15% critical value (11.59) and the 10% critical value (19.93), and closer to the latter, I would say the relevance of your IVs is not bad.
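
The comparison described above can be checked by hand. Here is a minimal sketch in Stata, with the values copied manually from the first output in this thread; the scalar names are made up for illustration and are not anything xtivreg2 creates.

```stata
* Values typed in by hand from the first xtivreg2 output in this thread
* cd_f : Cragg-Donald Wald F statistic
* cv10 : Stock-Yogo critical value, 10% maximal IV size
* cv15 : Stock-Yogo critical value, 15% maximal IV size
scalar cd_f = 17.738
scalar cv10 = 19.93
scalar cv15 = 11.59

* Each comparison evaluates to 1 if the statistic exceeds the
* critical value and to 0 otherwise
display "Exceeds 10% critical value: " (scalar(cd_f) > scalar(cv10))
display "Exceeds 15% critical value: " (scalar(cd_f) > scalar(cv15))
```

The first comparison displays 0 and the second displays 1, so the statistic clears the 15% threshold but not the 10% one.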

Deepika Deshpande

Join Date: Dec 2020
Posts: 107

16 Nov 2021, 04:35

Thank you, Fei Wang. Your response is very helpful. I have a follow-up question. I also needed an IV for the squared term and took the simple approach of using the square of the original IV. This time, I get a Cragg-Donald Wald F statistic < 10; however, it is higher than all of the Stock-Yogo weak ID test critical values. Should I interpret this as a case of weak IVs? (I am pasting my output below.)

Code:

. xtivreg2 Ln_EBIT_ROA Ln_Revenue Ln_LTD_to_Sales Ln_Intangible_Assets CoAge wGDPpc wCPI wDCF wExp
> gr wGDPgr wCons No_of_Regions Ln_PS_RD (l1.Ln_GSD l1.Ln_GSD_Sqd=  Ln_Int_exp Ln_Int_exp_sqd ) if
>  CoAge>=0 & NATION=="UNITED STATES" & NATIONCODE==840 & FSTS>=10 & FSTS <=100 & GENERALINDUSTRYC
> LASSIFICATION ==1 & Year_<2020 & Year_<YearInactive & Discr_GS_Rev!=1, fe endog (l1.Ln_GSD)
Warning - singleton groups detected.  36 observation(s) not used.

FIXED EFFECTS ESTIMATION
------------------------
Number of groups =       148                    Obs per group: min =         2
                                                               avg =       5.8
                                                               max =        17

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =      861
                                                      F( 14,   699) =     3.26
                                                      Prob > F      =   0.0000
Total (centered) SS     =  240.0292164                Centered R2   =  -0.5274
Total (uncentered) SS   =  240.0292164                Uncentered R2 =  -0.5274
Residual SS             =  366.6312397                Root MSE      =    .7171

--------------------------------------------------------------------------------------
         Ln_EBIT_ROA |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
              Ln_GSD |
                 L1. |  -1.107586   .6487185    -1.71   0.088    -2.379051     .163879
                     |
          Ln_GSD_Sqd |
                 L1. |   .2694965   .2014521     1.34   0.181    -.1253424    .6643354
                     |
          Ln_Revenue |   .5708208   .1720829     3.32   0.001     .2335444    .9080972
     Ln_LTD_to_Sales |  -.1861396    .043201    -4.31   0.000     -.270812   -.1014672
Ln_Intangible_Assets |  -.0471489   .0533334    -0.88   0.377    -.1516803    .0573826
               CoAge |  -.0366055   .0174361    -2.10   0.036    -.0707796   -.0024313
              wGDPpc |   .0000962   .0000414     2.32   0.020      .000015    .0001773
                wCPI |  -.0125554   .0342342    -0.37   0.714    -.0796531    .0545423
                wDCF |  -5.24e-15   1.92e-13    -0.03   0.978    -3.81e-13    3.70e-13
              wExpgr |    .014282   .0142111     1.00   0.315    -.0135712    .0421352
              wGDPgr |  -.0323283   .0425165    -0.76   0.447    -.1156592    .0510025
               wCons |   4.90e-14   7.36e-14     0.67   0.505    -9.52e-14    1.93e-13
       No_of_Regions |   .1330648   .1049293     1.27   0.205    -.0725928    .3387224
            Ln_PS_RD |  -.0534666   .0862404    -0.62   0.535    -.2224947    .1155615
--------------------------------------------------------------------------------------
Underidentification test (Anderson canon. corr. LM statistic):          14.149
                                                   Chi-sq(1) P-val =    0.0002
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):                7.076
Stock-Yogo weak ID test critical values: 10% maximal IV size              7.03
                                         15% maximal IV size              4.58
                                         20% maximal IV size              3.95
                                         25% maximal IV size              3.63
Source: Stock-Yogo (2005).  Reproduced by permission.
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):           0.000
                                                 (equation exactly identified)
-endog- option:
Endogeneity test of endogenous regressors:                               9.983
                                                   Chi-sq(1) P-val =    0.0016
Regressors tested:    L.Ln_GSD
------------------------------------------------------------------------------
Instrumented:         L.Ln_GSD L.Ln_GSD_Sqd
Included instruments: Ln_Revenue Ln_LTD_to_Sales Ln_Intangible_Assets CoAge
                      wGDPpc wCPI wDCF wExpgr wGDPgr wCons No_of_Regions
                      Ln_PS_RD
Excluded instruments: Ln_Int_exp Ln_Int_exp_sqd
------------------------------------------------------------------------------


Fei Wang

Join Date: Oct 2021

Posts: 726
#4

16 Nov 2021, 05:57

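In this case, your IVs would be sufficiently strong according to Stock-Yogo: the Cragg-Donald statistic (7.076) exceeds the 10% maximal IV size critical value (7.03), though only narrowly. F > 10 is simply a rule of thumb and need not be relied on when a formal test is available.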
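
One caveat worth checking when the margin is this thin: the outputs in this thread state "Statistics consistent for homoskedasticity only", so the Cragg-Donald statistic reported there rests on that assumption. Below is a minimal sketch of how one might request a first-stage summary and heteroskedasticity-robust statistics. It assumes the user-written commands xtivreg2, ivreg2 and ranktest are installed from SSC, and it uses Stata's nlswork example data purely as a stand-in, not the poster's data or model.

```stata
* One-time installs from SSC (uncomment if needed)
* ssc install xtivreg2
* ssc install ivreg2
* ssc install ranktest

* Stand-in panel data that ships with Stata; NOT the data from this thread
webuse nlswork, clear
xtset idcode year

* ffirst : print a summary of first-stage diagnostics
* robust : heteroskedasticity-robust statistics; the weak identification
*          section then also reports the Kleibergen-Paap rk Wald F statistic
xtivreg2 ln_wage age not_smsa (tenure = union south), fe ffirst robust
```

The Stock-Yogo critical values are still printed in the robust run, but they are tabulated for the Cragg-Donald statistic under i.i.d. errors, so comparing them with the Kleibergen-Paap statistic is informal guidance rather than a formal test.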
Deepika Deshpande

Join Date: Dec 2020

Posts: 107
#5

16 Nov 2021, 06:09

Thank you so much, Fei Wang. This is most helpful! However, just to clarify, does the Stock-Yogo test override the F-test? If there is any reference material on this, please point me to it. I am relatively new to these post-estimation tests for 2SLS. Thank you so much.
Fei Wang

Join Date: Oct 2021

Posts: 726
#6

16 Nov 2021, 06:15

I would suggest "Weak Instruments in Instrumental Variables Regression: Theory and Practice" (2019) by Andrews, Stock and Sun. It's an accessible review of weak IV tests for empirical researchers.
Deepika Deshpande

Join Date: Dec 2020

Posts: 107
#7

16 Nov 2021, 16:25

Thanks a lot, Fei Wang. I will look it up.
Dao DinhNguyen

Join Date: Feb 2021

Posts: 13
#8

24 Nov 2021, 00:51

Dear Fei Wang

I have a question. I use xtivreg2 for panel data with two instruments. The first-stage F statistic is > 10, which suggests the IVs are strong. I could not reject the null hypothesis of the Hansen test, so I conclude that the overidentifying restrictions are plausibly satisfied. The first stage also shows that the IVs have a significant effect on the main endogenous variable.

However, I failed to reject the null of the endogeneity test for the endogenous variable (Hausman test); the p-value is quite large (> 0.5). I discussed this with my friends, and they said I could ignore the Hausman test because they doubt this test as implemented in xtivreg2. In my paper, I argue strongly that the treatment variable should be endogenous because it could be affected by unobserved variables, and the causal link between the treatment variable and the outcome is quite clear.

What do you think about this? Do you think ignoring the Hausman test is acceptable for my work?

Thank you in advance
Dao
Fei Wang

Join Date: Oct 2021

Posts: 726
#9

24 Nov 2021, 01:13

Dao, I actually never run a Hausman test in this situation, because differences between the OLS and 2SLS estimates can arise for many reasons, and the test result rarely supports a firm conclusion about the endogeneity of the independent variable. If you think there are endogeneity issues, then use instruments. The most important things are to (1) argue that the IVs are exogenous and (2) test that the IVs are sufficiently relevant in the first stage.

Last edited by Fei Wang; 24 Nov 2021, 01:29.
Dao DinhNguyen

Join Date: Feb 2021

Posts: 13
#10

24 Nov 2021, 01:29

Hi Fei Wang

Yes, I rely on the first-stage results, the F test, and the Hansen test in my work. Thanks for your invaluable comment. I can start writing my paper.

Have a good day
Dao
