  • Instrumental variables regression with omitted variable bias in first stage

    Dear Stata community,

    I want to use two-stage least squares (2SLS) to estimate one structural equation in a system of two simultaneous equations. My setting is very similar to the one discussed here: https://www.stata.com/support/faqs/s...es-regression/

    The system consists of the two structural equations (1) and (2) and the corresponding reduced-form equations (1r) and (2r):

    Code:
    (1)  Y1 = a0 + a1 * Y2 + a2 * X1 + a3 * X2 + e1
    (2)  Y2 = b0 + b1 * Y1 + b2 * X3 + b3 * X4 + e2
    (1r) Y1 = c0 + c1 * X1 + c2 * X2 + c3 * X3 + c4 * X4 + u1
    (2r) Y2 = f0 + f1 * X1 + f2 * X2 + f3 * X3 + f4 * X4 + u2
    I want to estimate the structural equation (1) by 2SLS, with X3 and X4 as the instruments for Y2.

    As discussed in the FAQ linked above, omitting X4 as an instrument in the first-stage regression (i.e., the reduced form for Y2) does not bias the second-stage estimates of (1); it only implies a loss in efficiency.

    However, in my case, the variable X4 is unobservable but correlated with one of the exogenous variables of the structural equation for Y1, e.g., X1. This creates something like an “omitted variables problem” in the first-stage regression: the coefficient on X1, i.e., f1, is biased because it also captures (at least partly) the effect of X4 on Y2 (i.e., f4).

    My question is: are the second-stage coefficients in the 2SLS regression of (1) still unbiased in this setting, even if f1 in the first-stage regression is inflated by the unobserved effect of X4?

    My guess is that, for the predicted values of Y2, it does not matter whether the fraction of the variance of Y2 explained by X4 enters the fitted values “directly”, i.e., by observing X4 and estimating f4, or “indirectly”, i.e., through an inflated coefficient f1.

    Kind regards,

    Lukas

  • #2
    Dear Lukas Esser

    In this case, your first stage cannot be interpreted as a reduced form (because of the omitted variable), but that does not affect the consistency of the second stage: the first stage only needs to be a linear projection of Y2 on the exogenous variables you actually use, and that is what you are estimating.
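
    The point can be checked with a small simulation. The following is only a minimal sketch under assumed values: the parameter values (a1 = 0.5, b1 = 0.3, all other slopes 1, both intercepts 1), the 0.7 link between X4 and X1, the sample size, and the seed are all made up for illustration and do not come from the original question. The data are generated from the solved form of (1) and (2), and X4 is then treated as unobserved.

    ```stata
    clear all
    set seed 12345
    set obs 100000

    * Exogenous variables; x4 is correlated with x1 (assumed link of 0.7)
    generate x1 = rnormal()
    generate x2 = rnormal()
    generate x3 = rnormal()
    generate x4 = 0.7*x1 + rnormal()

    * Structural errors of equations (1) and (2)
    generate e1 = rnormal()
    generate e2 = rnormal()

    * Assumed structural equations:
    *   (1) y1 = 1 + 0.5*y2 + x1 + x2 + e1
    *   (2) y2 = 1 + 0.3*y1 + x3 + x4 + e2
    * Solving the two equations for y1 gives the line below (1 - 0.5*0.3 = 0.85)
    generate y1 = (1.5 + 0.5*(x3 + x4 + e2) + (x1 + x2 + e1)) / 0.85
    generate y2 = 1 + 0.3*y1 + x3 + x4 + e2

    * First stage without x4: a linear projection, not the reduced form (2r);
    * the coefficient on x1 absorbs part of the effect of x4
    regress y2 x1 x2 x3

    * 2SLS for (1) using only x3 as the excluded instrument (x4 "unobserved")
    ivregress 2sls y1 x1 x2 (y2 = x3)

    * Benchmark that would be feasible only if x4 were observed
    ivregress 2sls y1 x1 x2 (y2 = x3 x4)
    ```

    Under these assumptions, both ivregress calls should give a coefficient on y2 close to the assumed 0.5 in a large sample, with the second one typically showing a smaller standard error, while the first-stage coefficient on x1 differs from its reduced-form value.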

    Best wishes,

    Joao


    • #3
      Dear Joao,

      Thank you for your answer. It helped me a lot!

      Best wishes,

      Lukas
