
  • #16
    ..
    Last edited by Devon Smith; 06 Jul 2022, 20:19.



    • #17
      Jeff:

      Just had a follow-up question:

      If the homoskedasticity assumption in the structural equation does not hold, does using the fitted values as the instrument lead to more efficient estimates than using the actual instrument, Z, itself? In my case, the instrument is also binary.



      • #18
        Devon: There are no guarantees with heteroskedasticity. It could still be more efficient. Use robust standard errors in both cases. With binary Z you’re counting on variation in X to strengthen the IV.
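
        A minimal Stata sketch of the comparison discussed above, run on simulated data. The variable names (y, d, x, z, phat) and the data-generating process are assumptions made up for illustration; nothing here comes from the poster's data. One 2SLS run uses the probit fitted probabilities as the instrument, the other uses the binary Z itself, and both use robust standard errors:

        ```stata
        * Simulated data (hypothetical): binary instrument z, exogenous x,
        * binary endogenous regressor d, heteroskedastic structural error u
        clear
        set seed 12345
        set obs 5000
        generate x = rnormal()
        generate byte z = runiform() < 0.5
        generate v = rnormal()
        generate u = (0.5*v + rnormal())*exp(0.3*x)
        generate byte d = (0.5*z + 0.8*x + v > 0)
        generate y = 1 + 1*d + 0.5*x + u

        * Probit for d on z and x, then fitted probabilities
        probit d z x
        predict phat, pr

        * 2SLS with the fitted probability as the instrument for d
        ivregress 2sls y x (d = phat), vce(robust)

        * 2SLS with the binary instrument z itself
        ivregress 2sls y x (d = z), vce(robust)
        ```

        Comparing the robust standard errors on d across the two runs shows which instrument is more precise in a given sample. Under heteroskedasticity neither is guaranteed to win.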



        • #19
          Got it! Thanks, Jeff! Appreciate your help.

          Best, Devon.



          • #20
            You may want to check the following from Angrist & Krueger (2001):

            We conclude our review of pitfalls with a discussion of functional form issues for both the first and second stages in two-stage least squares estimation. Researchers are sometimes tempted to use probit or logit to generate first-stage predicted values in applications with a dummy endogenous regressor. But this is not necessary and may even do some harm. In two-stage least squares, consistency of the second-stage estimates does not turn on getting the first-stage functional form right (Kelejian, 1971). So using a linear regression for the first-stage estimates generates consistent second-stage estimates even with a dummy endogenous variable. Moreover, using a nonlinear first stage to generate fitted values that are plugged directly into the second-stage equation does not generate consistent estimates unless the nonlinear model happens to be exactly right, a result which makes the dangers of misspecification high.



            Angrist, J. D., & Krueger, A. B. (2001). Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments. Journal of Economic Perspectives, 15(4), 69–85. https://doi.org/10.1257/jep.15.4.69
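
            The contrast in the quoted passage can be sketched in Stata on simulated data. All variable names and the data-generating process below are hypothetical, chosen only for illustration. The latent error for d is deliberately skewed so that the probit first stage is misspecified. The first approach plugs probit fitted values directly into the second stage; the second is ordinary 2SLS with a linear first stage:

            ```stata
            * Simulated data (hypothetical): the latent error v is skewed,
            * so a probit model for d is not exactly right
            clear
            set seed 2001
            set obs 5000
            generate x = rnormal()
            generate byte z = runiform() < 0.5
            generate v = (rchi2(1) - 1)/sqrt(2)
            generate u = 0.5*v + rnormal()
            generate byte d = (0.5*z + 0.8*x + v > 0)
            generate y = 1 + 1*d + 0.5*x + u

            * Plug-in approach the quote warns against: probit fitted values
            * replace d as a regressor in the second stage
            probit d z x
            predict phat_plug, pr
            regress y phat_plug x, vce(robust)

            * 2SLS with a linear first stage; the first option also
            * displays the first-stage regression
            ivregress 2sls y x (d = z), vce(robust) first
            ```

            The coefficient on phat_plug and the 2SLS coefficient on d need not agree when the probit is misspecified. The standard errors from the manual plug-in regression also ignore that phat_plug is estimated.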





              • #22
                Hello Jeff, I have a question related to this thread. Is it okay if ivreg2 shows that my instruments are strong, but when I run the first step (probit), only one of the three instruments is significant? Can I still use the estimated probability as an instrument, or does this mean that the instruments are questionable?
                Thanks



                • #23
                  Originally posted by Jeff Wooldridge:
                  Devon: There are no guarantees with heteroskedasticity. It could still be more efficient. Use robust standard errors in both cases. With binary Z you’re counting on variation in X to strengthen the IV.
                  Could you please assist with a situation where both the dependent and the endogenous variables (Y and D) are binary in a simultaneous equation model with reverse causality?

                  Y = a0 + a1 D + a2 X + U1
                  D = b0 + b1 Y + b2 X + U2
                  I was reading about a method suggested by Maddala (1983). Can the two stages be done using probit ML?



                  • #24
                    Joao Santos Silva, would you please assist?

                    I have a similar case and am trying to avoid the forbidden regression. I have two simultaneous equations, one for poverty and the other for informal employment, specified as follows:
                    Poor = a0 + a1 Informal employment + a2 X + a3 Z1 + U1
                    Informal employment = b0 + b1 Poor + b2 X + b3 Z2 + U2
                    Both the dependent and the endogenous variables are binary in the two equations, the vector X contains the same exogenous variables in both, and Z1 and Z2 are instruments.

                    I was following Maddala (1983), who suggested estimating probit ML in the first and second stages. However, after reading Angrist, I discovered that this is not appropriate and leads to the forbidden regression, and that I should use an LPM instead. Kindly assist me in working this out for my two simultaneous equations.

                    Thanks



                    • #25
                      Dear Michael Zuze,

                      Maybe I am missing something, but I would say that the standard in this case is to simply use 2SLS and ignore the fact that the dependent variables are binary. Of course, you need to interpret the results with the necessary caution.

                      Best wishes,

                      Joao



                      • #26
                        Thanks, Joao Santos Silva. I have tried it as shown below. Initially I wanted to use biprobit, but the model did not converge, probably because I specified reverse causality between poor and informal employment, so each variable is the dependent variable in one equation and a regressor in the other.

                        global y1 hhinformal // Informal sector employment, binary
                        global y2 poor // Poverty status, binary
                        global x1 hdmale i.hhage i.hheduc hhmarried hdsize tot_informal urban // Shared predictors
                        global z1 child_under6 m_hseduc // Instruments for equation 1 (hhinformal)
                        global z2 large_firm // Instrument for equation 2 (poor)

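                        // 2SLS for the hhinformal equation: poor is endogenous and instrumented by z2; the first option also reports the first stage
                        ivregress 2sls $y1 ($y2 = $z2) $x1 $z1, first

                        // 2SLS for the poor equation: hhinformal is endogenous and instrumented by z1; the first option also reports the first stage
                        ivregress 2sls $y2 ($y1 = $z1) $x1 $z2, first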
