
  • Does a weak relationship between instruments lead to a higher p-value in the over-identification test?

    Dear fellow Stata users:

    I am working on a paper that uses two instrumental variables (IVs) to identify my explanatory variables. However, I have been informed that my IVs might be weakly correlated, which could affect the reliability of the p-value of the Hansen J test supporting the exogeneity of the instruments. Specifically, the concern raised in the report is that the correlation between the IVs could lead to inflated p-values in the over-identification test.

    As I understand it, the argument is that we are testing whether IV1 and IV2 are correlated with the error term in the structural equation. If the IVs are correlated with each other, the correlations between IV1 and the error term, and IV2 and the error term, may provide redundant information, leading to less power in the test and thus a higher p-value.

    Since I am more focused on applied work, I am not entirely sure about the technical details of this issue. I would appreciate any guidance on how to address this concern in my paper. Moreover, I believe that in many papers using the IV approach with multiple instruments, some degree of correlation between instruments is common. Could anyone provide more insights into this, or point me toward relevant papers that discuss this issue?
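
    For concreteness, here is roughly what I am running, with placeholder names (y is the outcome, x the endogenous regressor, z1 and z2 the two IVs, w1 and w2 exogenous controls):

    Code:
    * placeholder names: y outcome, x endogenous, z1 z2 excluded IVs, w1 w2 controls
    ivreg2 y w1 w2 (x = z1 z2), robust
    * with -robust-, the output reports the Hansen J statistic and its p-value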


    Thank you so much for your help,
    Alex

  • #2
    Sample size?

    How correlated are they?



    • #3
      Are the instruments weak?



      • #4
        You are violating the assumption of multicollinearity, which states that the conditional residuals of the predictor variables should not be correlated. Often, people misunderstand this assumption, believing it means the predictors themselves should not be correlated (but that isn't quite right). The best solution (in an SEM context) is to figure out why the residuals are correlated and model the relationship explicitly. For example, is there some third variable that explains the correlated errors? Some clustering in the data that has not yet been accounted for? Correlated errors almost always point to some exogenous feature of the data that has not yet been modeled, which may be why you are getting this feedback.
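
        A minimal sem sketch of those two options, with purely hypothetical variable names (z1 and z2 predict x1 and x2, which predict y; w is a presumed common cause of the predictors):

        Code:
        * option 1: a third variable w explains the correlated residuals; model it
        sem (x1 <- z1 w) (x2 <- z2 w) (y <- x1 x2)
        * option 2: leave the source unmodeled but estimate the residual covariance
        sem (x1 <- z1) (x2 <- z2) (y <- x1 x2), cov(e.x1*e.x2)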

        I take it George is asking you about sample size and so on because he is trying to assess how bad the issue is. It's fairly typical in most non-SEM contexts to ignore small issues with multicollinearity in a regression because they don't necessarily translate into a large problem for inference. If you search for Clyde Schechter's posts, he's written about this extensively in various places on this forum.

        In the SEM world, there is a fairly influential camp of people who seem to argue that the only good model is the true generative model (whatever that means), and though I am skeptical, they make a good point when they say that correlated errors suggest there is a feature of the data missing from your model. The SEM response is usually to try to incorporate that feature into your model.



        • #5
          This "I have been informed that my IVs might be weakly correlated" sounds like a concern for weak instruments, not that z1 and z2 are correlated. (The statement wasn't that they were strongly correlated). Is this a possible interpretation? ivreg2 gives a test for weak instruments.

          Hausman (1978) and Wooldridge (2010, textbook) suggest running the model including only one of the exclusions at a time and then comparing the results across the three alternatives (both IVs, z1 only, z2 only). They should be similar, especially if the exclusions are highly correlated.
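
          A sketch of that comparison (placeholder names again):

          Code:
          ivreg2 y (x = z1 z2), robust   // both exclusions
          estimates store both
          ivreg2 y (x = z1), robust      // z1 only (just-identified)
          estimates store only_z1
          ivreg2 y (x = z2), robust      // z2 only (just-identified)
          estimates store only_z2
          estimates table both only_z1 only_z2, se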

          You might also generate a factor using the two and use that as the exclusion. The results should also be close to yours.
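
          For instance, using the first principal component (one of several ways to build such a factor):

          Code:
          pca z1 z2
          predict zfac, score            // first principal component
          ivreg2 y (x = zfac), robust    // single exclusion, so just-identified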

          And if the correlation between the two is low, then that should do it; variables are typically correlated to some degree without causing a problem. You might also run the first stage and get the VIFs. (This might be all you need.)
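
          Something like:

          Code:
          regress x z1 z2 w1 w2          // first stage by hand (w1 w2 are the controls)
          estat vif                      // variance inflation factors
          correlate z1 z2                // raw correlation between the IVs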

          If these work out, then you should have a sufficient response.

          Might even try ivreg2h (Lewbel) just for kicks.
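
          If memory serves on the syntax (ivreg2h is on SSC and mirrors ivreg2):

          Code:
          ssc install ivreg2h
          ivreg2h y (x = z1 z2), robust  // adds Lewbel-type generated instruments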







          • #6
            Originally posted by George Ford (see #5 above)

            Hi George,

            Thank you so much for your reply! I should have provided more details. The IVs are not weak: the first-stage F-statistic is above 100 when both instruments are used together, and both F-statistics are above 10 when the instruments are used separately.

            I think the comment comes from the idea that if there is correlation between IV1 and IV2, this could lead to a higher p-value in the over-identification test. So even though the p-value of my over-identification test is currently 0.5, it's not considered trustworthy because IV1 and IV2 are correlated. But I'm not sure whether this is indeed the case: does correlation between IV1 and IV2 lead to an inflated p-value? And if both IVs can predict the endogenous variables, wouldn't they be correlated with each other anyway?

            Thank you for your suggestion to run the first stage and get the VIFs. Do you have a sense of how low the correlation between the two IVs needs to be to avoid inflating the p-value of the over-identification test? Any references on this would be appreciated.

            Thank you again for your help,
            Alex



            • #7
              In the extreme case, when IV1 and IV2 are perfectly correlated, the p-value of the overidentification test would be 1 irrespective of whether the IVs are valid or not. So, yes, a high correlation between the IVs can inflate the p-values. If the correlation between the IVs is weak, this should generally not cause much of a problem, unless the sample size is very small, in which case these tests can be quite unreliable anyway.

              In principle, problems can also arise with weaker degrees of correlation. Consider the following setup:

              Say, there are two instruments, IV1 and IV2. Both instruments are proxies of the same underlying predictor Z (which is correlated with the endogenous regressor X); the measurement errors in IV1 and IV2 are mutually uncorrelated and uncorrelated with the structural errors. The overidentification test aims to test whether Z is validly excluded from the regression model of y on X. For a given correlation of Z with X, the correlation of the IVs with X (and the correlation of the IVs with each other) is driven by the variance of their measurement errors (high variance -> low correlation, and vice versa). The problem here is that both IVs are either jointly valid instruments (if Z is validly excluded) or jointly invalid instruments (if Z is a relevant predictor of y after controlling for X).

              There is, in fact, no overidentification, because all the identification comes from the single underlying predictor Z. Even if the IVs are invalid, the overidentification test is expected to return a high p-value. Why is that? The overidentification test effectively contrasts the estimates from the 2SLS regression using both IVs with the estimates from using just one of the two IVs. But both estimators estimate the same pseudo-true value.

              This problem can arise independently of the strength of the IVs' correlation with each other. The relevant question is: are there independent sources of variation in IV1 and IV2 that have predictive power for X? If yes, then you should generally be fine, although the finite-sample performance of the test would still depend on the strength of the correlations.
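
              A small simulation of the proxy setup above may make this concrete (all names are hypothetical):

              Code:
              * z is the latent predictor; iv1 and iv2 are noisy proxies of z;
              * z also enters y directly, so both instruments are invalid
              clear
              set seed 12345
              set obs 1000
              gen z   = rnormal()
              gen iv1 = z + rnormal()
              gen iv2 = z + rnormal()
              gen x   = z + rnormal()
              gen y   = x + z + rnormal()
              * both just-identified estimators share the same pseudo-true value,
              * so the Hansen J test has essentially no power: its p-value is not
              * systematically small despite the invalid instruments
              ivreg2 y (x = iv1 iv2), robust
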
              https://www.kripfganz.de/stata/
