Is there any reason why one should not report an F-value of overall significance at the bottom of a 2SLS regression result table?

Franz Hopp

Join Date: Feb 2015

Posts: 42
#1

23 Apr 2022, 09:31

Dear Statalist community,

As I am currently in the process of adding a 2SLS robustness check to a paper, I was wondering whether there is any reason why one should not report an F-value (of overall significance) at the bottom of a 2SLS regression result table (just as one would normally report the F-value at the bottom of a regression result table for OLS)?

I recently looked at several examples of 2SLS analyses published in top tier management journals, and I felt a bit surprised that basically none of these papers reported an F-value in their regression result tables. Thus I started to wonder whether there may be any reason why one should not report this value?

I just wanted to make sure that during paper submission, I don’t include a regression result table that shows a gross misunderstanding of mine of the 2SLS procedure, e.g., by including a wrong statistic that doesn’t belong in such a table. In the current version of the paper, we include the F-value of overall significance in the regression result tables for both OLS and 2SLS.

Any advice on the above question would be immensely appreciated.

All the best wishes,

Franz
Tags: 2SLS, f-test, instrumental variables
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17708
#2

23 Apr 2022, 10:24

Franz:
how are interested listers supposed to reply to your query if you do not share what you typed and what Stata gave you back via CODE delimiters (as recommended by the FAQ)?

Kind regards,
Carlo
(Stata 19.0)

Franz Hopp

Join Date: Feb 2015
Posts: 42
#3

24 Apr 2022, 04:50

Carlo, thanks for your reply, and please excuse my not adding that information earlier.

The Stata code I used is the following:

HTML Code:

xtivreg2 DV control1 control2 control3 (IV IV_squared = Z Z_squared), fe robust cluster(FirmID)

Where DV = dependent variable, IV = independent variable, Z = instrument

HTML Code:

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on Ticker

Number of clusters (Ticker) =      367                Number of obs =    3,564
                                                      F( 18,   366) =     3.51
                                                      Prob > F      =   0.0022
Total (centered) SS     =  25.88220643                Centered R2   =   0.0296
Total (uncentered) SS   =  25.88220643                Uncentered R2 =   0.0296
Residual SS             =  25.66431129                Root MSE      =    .1561

------------------------------------------------------------------------------
             |               Robust
          DV | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
          IV |      2.333    .492303     2.14   0.001     .0870583    2.016851
  IV_squared |    -12.837   3.365118    -2.15   0.000    -13.82766    -.636645
    Control1 |      0.021    .002148    -1.85   0.087    -.0081904    .0002295
    Control2 |     -0.022   .0244234     1.96   0.156    -.0001047    .0958426
    Control3 |     -0.087   .0444234     3.96   0.292    -.0002077    .0958426

Underidentification test (Kleibergen-Paap rk LM statistic):              8.513
                                                   Chi-sq(1) P-val =    0.0934
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):              282.293
                         (Kleibergen-Paap rk Wald F statistic):         26.185
Stock-Yogo weak ID test critical values: 10% maximal IV size              6.43
                                         15% maximal IV size              5.88
                                         20% maximal IV size              4.55
                                         25% maximal IV size              4.73
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         0.000
                                                 (equation exactly identified)
------------------------------------------------------------------------------

The result in Stata is shown above.

And the table I am planning to insert into the paper is the following (the F-value of overall significance that I am concerned about is in the last line):

Code:

 
                     Model 1
                     Coefficient    P-value
IV                      2.333        0.001
IV_squared            -12.837        0.000
Control1                0.021        0.087
Control2               -0.022        0.156
Control3               -0.087        0.292
Constant               -0.125        0.191

Observations            3,564
Number of firms           367
F-value                 3.510        0.002

Last edited by Franz Hopp; 24 Apr 2022, 04:56.

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17708
#4

24 Apr 2022, 07:57

Franz:
I do not know if a single reply to your question exists.
A tentative answer is that in panel data regression with the -fe- specification there are more important pieces of information to report, such as the within R_sq, sigma_u, sigma_e and rho.
Usually, the F-statistic is of limited informative value, as it basically tests whether analysing the mean of the regressand (i.e., no predictors) is (or is not) less informative than analysing its mean conditional on the predictors.
As an aside, whenever we use a community-contributed module, we have to declare it, for reasons that are well explained in the FAQ. Thanks.
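
The overall-significance test mentioned above can be written out as a rough sketch. The notation is an assumption introduced here, not taken from the thread: k slope coefficients collected in beta, their estimated (cluster-robust) covariance matrix V, and G clusters.

```latex
H_0:\ \beta_1 = \beta_2 = \dots = \beta_k = 0
\qquad
W = \hat{\beta}^{\top} \hat{V}^{-1} \hat{\beta}
\qquad
F \approx \frac{W}{k}
```

Under H_0, W is approximately chi-squared with k degrees of freedom, and the F version is W divided by k, up to a finite-sample adjustment that depends on the software. With clustered standard errors the denominator degrees of freedom are usually G - 1, which is consistent with the F( 18, 366) reported for 367 clusters in the output posted earlier in the thread.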

Kind regards,
Carlo
(Stata 19.0)
Franz Hopp

Join Date: Feb 2015

Posts: 42
#5

24 Apr 2022, 16:04

Carlo,

Thank you so much for your reply, you don't know how helpful your comment has been for me! Thanks a ton!

It is great to know that in general it is possible to add the F-value of overall significance in a regression result table for 2SLS (although doing so is of limited value).

One brief follow-up question: Is there also any way of obtaining the F-value of overall significance using the xtivreg command (instead of xtivreg2)? I have tried adding the statistic via the xtivreg command, but so far have failed to do so.

Xtivreg only seems to report "Wald chi2()".

All the best,

Franz

(Also, thank you so much for your comment about writing about community-contributed commands, I will keep this in mind!)
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17708
#6

24 Apr 2022, 16:42

Franz:
-xtivreg- gives back -e(F)- if you include the -small- option or -e(chi2)- otherwise.
That said, my previous comment about the limited usefulness of this piece of information still holds.
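
A minimal sketch of the -small- point, assuming the nlswork example dataset available through -webuse- rather than Franz's data; the variables and instruments below are illustrative only and have nothing to do with the original model:

```stata
* Illustrative data only (not the poster's data)
webuse nlswork, clear

* Default: the header reports a Wald chi2, stored in e(chi2)
xtivreg ln_wage age c.age#c.age not_smsa (tenure = union south), fe
display "Wald chi2 = " e(chi2)

* With -small-: the header reports an F statistic, stored in e(F)
xtivreg ln_wage age c.age#c.age not_smsa (tenure = union south), fe small
display "Overall F = " e(F)

* List everything the command stored, to see what else is available
ereturn list
```

The value in e(F) can then be passed to whatever table-export routine is already in use.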

Kind regards,
Carlo
(Stata 19.0)
