
  • Significant OLS results but insignificant IV results and the insignificant endogeneity test

    Dear all,

I want to explore the effect of X on Y. Since X might be endogenous, I run both OLS and IV regressions.

OLS gives a significant result (coefficient = 0.3, std. err. = 0.058, p = 0.000), while the IV regression gives an insignificant result (coefficient = 0.2, std. err. = 0.648, p > 0.1). Based on the IV, I would conclude that there may be no causal effect of X on Y.

However, when I run the endogeneity test after the ivregress command, the test is insignificant, indicating no evidence that X is endogenous. So how should I interpret the results? Should I take the OLS estimate as the real causal effect, since there is no evidence of endogeneity, or should I follow the IV result and conclude that there is no causal effect of X on Y?

    Thanks for your help.
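For concreteness, here is a minimal sketch of the workflow described above (y, x, and z are placeholder names, not variables from the post):

```stata
* Sketch: y = outcome, x = possibly endogenous regressor, z = instrument
regress y x                  // OLS
ivregress 2sls y (x = z)     // IV (2SLS)
estat endogenous             // endogeneity test after ivregress
estat firststage             // first-stage strength of the instrument
```

An insignificant `estat endogenous` statistic is what the post describes: the data do not reject the exogeneity of x.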



  • #2
I would interpret your results in one of two ways:

    1. If you trust your instrument, then OLS is fine, and you can carry on with OLS (in any case, the difference between the OLS and IV estimates is small and insignificant).
    2. Or as a reason to suspect your instrument: it is possible that you have a bad instrument which is itself correlated with the error, and therefore exhibits a bias similar to OLS.

    Comment


    • #3
      Thank you, Joro, for providing these helpful thoughts. It's really difficult to check whether my instrument is OK or not...

      Comment


      • #4
        Without knowing the context I wouldn’t conclude the difference in OLS and IV is “small.” If Y = log(Wage) and X indicates a college degree then the practical difference is huge. That’s one reason the FAQ asks posters to show Stata output.

        For better or worse, we usually take OLS as the default and require strong evidence to use something else. In this case, OLS is not rejected.

        Have you checked the strength of your IV?

        Comment


        • #5
          Originally posted by Jeff Wooldridge View Post
Thank you for your reply, Jeff. Yes, I have checked the strength of my IV (the first-stage F statistic is larger than 10), so I believe my IV is strong. My X variable is whether or not one visits parks, and my Y variable is life satisfaction. I am using the IV to check whether the correlation between the two variables is driven by reverse causality. Although the difference between OLS and IV is insignificant for the whole sample, significant differences between OLS and IV were found in subgroups (defined by covariates). So I think maybe I should accept the OLS estimate as the real causal effect for the whole sample, while for the subgroups IV is also helpful and will reveal the truth. Does that make sense?

          Comment


          • #6
            This is a very interesting (sort of philosophical) discussion, Professor Wooldridge, and I respectfully half-disagree with you.

McCloskey, D. N., & Ziliak, S. T. (1996). The standard error of regressions. Journal of Economic Literature, 34(1), 97-114, and the many papers they subsequently wrote, have pushed hard the notion that we should look at the economic significance (as opposed to the statistical significance) of our estimates. Professor Wooldridge in his textbooks promotes the overall healthy practice of thinking about the economic size of the effects we estimate, as exemplified in his statement quoted below.

            In the words of McCloskey (I believe), "a highly statistically significant mouse is not very interesting, and a statistically insignificant elephant is very interesting."

To me, the notion that a highly significant mouse is not very interesting — that a small economic effect is still small even if highly significant — is uncontroversial.

I take issue with the notion, which Professor Wooldridge uses below, that an insignificant elephant is very interesting. If I see a point estimate of 0.2 with std. err. = 0.648, and the estimate is approximately normally distributed, I am 95% confident that the effect is between 0.2 - 1.96*0.648 = -1.07 and 0.2 + 1.96*0.648 = 1.47. To me this effect is not interesting, regardless of whether 0.2 means 0.2 pence or 0.2 trillion British pounds, for the simple reason that I know nothing about the effect: even if the point estimate is 0.2 trillion pounds, there is still a reasonable chance that the true effect is -1 trillion, or +1.5 trillion.
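The interval arithmetic above can be reproduced in Stata (coefficient and standard error taken from post #1):

```stata
* 95% confidence interval for the IV estimate in post #1
display "lower bound: " 0.2 - 1.96*0.648
display "upper bound: " 0.2 + 1.96*0.648
```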




            Originally posted by Jeff Wooldridge View Post

            Comment


            • #7
              What makes sense is up to you to decide: Are there theories in your field suggesting that the effects should be different by subgroups defined by covariates?

From a statistical point of view, the very definition of a 5% significance test is that if you carry it out 100 times, it will falsely reject a correct null hypothesis about 5 of those times. Therefore, if you keep testing across various subgroups, sooner or later you will find some rejections by chance alone.

You either need to do this in a theory-inspired way: if no theory suggests that the effect differs across subgroups, you test it once for all groups pooled; if there are theories that the effect differs by group, you test it by group.

Or, if you want to do it in a data-exploratory way, you need to control for multiple testing. The simplest way of doing this is the Bonferroni correction.
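A minimal sketch of the Bonferroni correction (the number of subgroup tests is hypothetical): with k tests at overall level alpha, compare each subgroup p-value against alpha/k.

```stata
* Bonferroni-adjusted threshold (k = 4 is an assumed number of subgroup tests)
local alpha = 0.05
local k = 4
display "compare each subgroup p-value to: " `alpha'/`k'
```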

              Originally posted by Lumeng Liu View Post

              Comment


              • #8
                Originally posted by Joro Kolev View Post


Thank you, Joro, for the very inspiring discussion. I agree with you that the final decision should be made according to the theory of my field; after all, statistical methods can sometimes mislead when applied mechanically. Thank you very much!

                Comment


                • #9
                  Dear Jeff Wooldridge,

I am running the following regression models on repeated cross-sectional data, and they yield different results:
Code:
 reg TFP X controls FE                              // first model: OLS
The OLS result is negative and highly significant, regardless of which FE I include. The pertinent literature reports both positive and negative results.
Code:
 reg TFP X#Z controls FE                            // second model: interaction
The result is negative and significant for the variable of interest.
Code:
 reg TFP X#Z#Y controls FE                          // third model: triple interaction
The result turns positive and significant.
Code:
 ivreg2 TFP controls (X = a b c) FE, gmm2s robust   // fourth model: IV-GMM
As a robustness check, the IV-GMM result turns positive and highly significant.
IV validation:
Underidentification test p-value = 0.00
Hansen J statistic p-value = 0.3421, suggesting the instruments are valid (?)

What does it mean if OLS and a valid IV have different signs?
How should I interpret these changing signs?
From the interaction terms, is it safe to say that the relationship is nonlinear?
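One way to probe the sign flip directly is ivreg2's built-in endogeneity test; a sketch, reusing the placeholder names from the models above:

```stata
* endog(X) adds a robust endogeneity test for X to the ivreg2 output
ivreg2 TFP controls (X = a b c) FE, gmm2s robust endog(X)
* A significant endogeneity statistic suggests the OLS and IV-GMM estimates
* of X differ systematically, which would favour the IV sign; an
* insignificant one suggests the flip may instead reflect instrument
* problems or the change in specification.
```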

                  Thanks,
                  Naimat

                  Comment
