Interpretation Kleibergen-Paap, Cragg-Donald and Stock-Yogo weak ID

Lisa Tara Smith

Join Date: Jun 2021

Posts: 13
#1

25 Jun 2021, 07:03

Dear users,

for my thesis I'm estimating an IV regression to see what effect stock option compensation has on a company's innovation.
As an instrument for stock option compensation I use the predicted first year of the cycle. The regression and its output are below.

ivreghdfe xrd_w (option_value=predictedfirstyear) if hitech==1, a(fyear sic) cluster(gvkey)
(MWFE estimator converged in 4 iterations)

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on gvkey

Number of clusters (gvkey) = 343 Number of obs = 17037
F( 1, 342) = 0.10
Prob > F = 0.7491
Total (centered) SS = 5.37069e+10 Centered R2 = 0.0087
Total (uncentered) SS = 5.37069e+10 Uncentered R2 = 0.0087
Residual SS = 5.32398e+10 Root MSE = 1769

------------------------------------------------------------------------------
| Robust
xrd_w | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
option_value | .0297179 .0928436 0.32 0.749 -.1528984 .2123342
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic): 2.060
Chi-sq(1) P-val = 0.1512
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic): 238.727
(Kleibergen-Paap rk Wald F statistic): 2.191
Stock-Yogo weak ID test critical values: 10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments): 0.000
(equation exactly identified)
------------------------------------------------------------------------------

For my first stage, I found the following:

//first stage: instrument regressions
. reghdfe option_value predictedfirstyear, cl(gvkey) a(fyear)
(MWFE estimator converged in 1 iterations)

HDFE Linear regression Number of obs = 141,526
Absorbing 1 HDFE group F( 1, 2585) = 23.06
Statistics robust to heteroskedasticity Prob > F = 0.0000
R-squared = 0.0006
Adj R-squared = 0.0004
Within R-sq. = 0.0004
Number of clusters (gvkey) = 2,586 Root MSE = 6317.1100

(Std. Err. adjusted for 2,586 clusters in gvkey)
------------------------------------------------------------------------------------
| Robust
option_value | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------------+----------------------------------------------------------------
predictedfirstyear | 424.6596 88.43824 4.80 0.000 251.2426 598.0766
_cons | 421.2141 24.09054 17.48 0.000 373.9754 468.4528
------------------------------------------------------------------------------------

With an F-statistic of 23.06, I thought I could reject that my instrument is weak using Staiger and Stock's rule of thumb (F > 10).
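Note that the first stage above was run on the full sample (141,526 observations) and absorbs only fyear, whereas the IV regression is restricted to hitech==1 (17,037 observations) and absorbs fyear and sic, so the two sets of statistics may not be directly comparable. A minimal sketch of a first stage on the IV estimation sample, assuming it is run from a do-file (the `///` line continuations do not work interactively) with the same variables in memory:

```stata
* Sketch only: assumes the variables from this post are in memory.
* Step 1: re-run the IV model so that e(sample) marks its estimation sample.
ivreghdfe xrd_w (option_value = predictedfirstyear) if hitech == 1, ///
    absorb(fyear sic) cluster(gvkey)

* Step 2: first stage on exactly that sample, with the same fixed effects
* and the same clustering as the IV regression.
reghdfe option_value predictedfirstyear if e(sample), ///
    absorb(fyear sic) cluster(gvkey)

* Step 3: cluster-robust F test on the excluded instrument.
test predictedfirstyear
```

Using `if e(sample)` rather than repeating `if hitech == 1` is a design choice: it also drops any observations that the IV command excluded for other reasons, such as missing values in xrd_w.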
However, I now have trouble interpreting the Kleibergen-Paap, Cragg-Donald and Stock-Yogo results reported with the IV regression.
My questions are:
1. For the Kleibergen-Paap rk LM statistic: is it true that the underidentification test checks whether the instrument is irrelevant, and that with a p-value of 0.1512 I am therefore rejecting that the instrument is irrelevant?
2. For the Cragg-Donald Wald F statistic: is it true that this again tests whether the instrument is weak, in the same way as the first-stage F statistic, so that with a value of 238.727 I can reject that the instrument is weak?
3. How does the Kleibergen-Paap rk Wald F statistic differ from the Cragg-Donald Wald F statistic?
4. Is it true that the Stock-Yogo tests are intended for multiple endogenous regressors and multiple instruments, and therefore should not be used in this case, where I have only one instrument?

I would greatly appreciate all help I can get.
Lisa Tara Smith

#2

25 Jun 2021, 14:00

I also have another question. I want to use revt, ROA and emp as control variables and the predicted first year as the instrument.
However, the bottom of the output lists revt, ROA and emp as included instruments and predictedfirstyear as the excluded instrument.
What am I doing wrong?

. ivreghdfe xrd_w (option_value=predictedfirstyear) revt ROA emp if hitech==1, a(fyear sic) cluster(gvkey)
(MWFE estimator converged in 4 iterations)

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on gvkey

Number of clusters (gvkey) = 343 Number of obs = 16953
F( 4, 342) = 12.19
Prob > F = 0.0000
Total (centered) SS = 5.36576e+10 Centered R2 = 0.5449
Total (uncentered) SS = 5.36576e+10 Uncentered R2 = 0.5449
Residual SS = 2.44214e+10 Root MSE = 1201

------------------------------------------------------------------------------
| Robust
xrd_w | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
option_value | .1435217 .0696 2.06 0.040 .0066237 .2804197
revt | .086323 .0279078 3.09 0.002 .0314305 .1412155
ROA | 303.733 135.2008 2.25 0.025 37.80316 569.6629
emp | -6.473182 8.158876 -0.79 0.428 -22.52108 9.574713
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic): 2.165
Chi-sq(1) P-val = 0.1412
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic): 234.540
(Kleibergen-Paap rk Wald F statistic): 2.302
Stock-Yogo weak ID test critical values: 10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments): 0.000
(equation exactly identified)
------------------------------------------------------------------------------
Instrumented: option_value
Included instruments: revt ROA emp
Excluded instruments: predictedfirstyear
Partialled-out: _cons
nb: total SS, model F and R2s are after partialling-out;
any small-sample adjustments include partialled-out
variables in regressor count K
------------------------------------------------------------------------------