  • Control function with fractional endogenous regressor

    Dear All,

    I am wondering about the application of the two-stage control function approach, specifically when the first stage is a fractional regression and the second stage is a probit model. Can I include the residuals from the first stage in the second stage? Is there any relevant literature or discussion addressing this issue?

    Thank you!




  • #2
    Does your fractional variable take on zero and/or one?



    • #3
      Dear Jeff,

      Yes, the endogenous variable is between zero and one. Thanks.

      Sophia



      • #4
        But does it hit those two values? That makes a difference.



        • #5
          Dear Jeff,

          The endogenous variable is a continuous variable bounded between zero and one. I am wondering whether I should use the residual (y - ŷ) or (y - Pr(y)) from fracreg in the first stage. Or could either choice cause misspecification?

          Thank you so much!

          Sophia



          • #6
            Let y1 be the binary outcome and y2 be the fraction. If 0 < y2 < 1 for every observation, I would use the log-odds in a linear regression: regress log(y2/(1 - y2)) on z1, z2 and obtain the residuals, v2hat. Then use probit or logit of y1 on y2, v2hat, z1. The log-odds is likely to give residuals that are independent of the exogenous variables, z1 and z2. The reason v2hat is a valid control function for y2 is that the log-odds is a one-to-one transformation from (0,1) to (-inf,inf).
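The first stage described in this post can be sketched in plain Python, as an illustration rather than a definitive implementation. The helper names (`log_odds`, `solve`, `first_stage_residuals`) and the data are made up for the example; only the variable names y2, z1, z2, and v2hat come from the post. The sketch also shows why the transformation needs 0 < y2 < 1 strictly: the log-odds is undefined at either limit.

```python
import math

def log_odds(y2):
    """Map a fraction strictly inside (0, 1) to the real line."""
    if not 0.0 < y2 < 1.0:
        raise ValueError("log-odds is undefined unless 0 < y2 < 1")
    return math.log(y2 / (1.0 - y2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def first_stage_residuals(y2, z1, z2):
    """OLS of log(y2/(1-y2)) on (1, z1, z2); returns the residuals v2hat."""
    w = [log_odds(v) for v in y2]
    X = [[1.0, a, b] for a, b in zip(z1, z2)]
    k = 3
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xtw = [sum(r[i] * wi for r, wi in zip(X, w)) for i in range(k)]
    beta = solve(XtX, Xtw)
    return [wi - sum(b * x for b, x in zip(beta, r)) for wi, r in zip(w, X)]

# Made-up illustrative data (not from the thread).
y2 = [0.10, 0.35, 0.50, 0.72, 0.90, 0.25]
z1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
z2 = [0.5, -1.0, 0.3, 1.2, -0.7, 0.0]
v2hat = first_stage_residuals(y2, z1, z2)
```

The residuals v2hat would then enter the second-stage probit or logit of y1 on y2, v2hat, z1 as an additional regressor.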



            • #7
              Dear Prof Jeff,

              Thank you for your insightful responses. Although the log-odds transforms y2 from (0,1) to (-inf, inf), in practice it generates missing values when y2 equals 1. How can I deal with this issue?

              Thanks once again.

              Best,

              Sophia



              • #8
                This is why I asked whether you observe any limit values. Because you do, the variable y2 is not "continuous." I can recommend following my 2014 Journal of Econometrics paper. You add the generalized residual from the first-stage estimation, which is fractional logit or probit. The generalized residual for the Bernoulli log likelihood is g(z*b2hat)(y2 - G(z*b2hat))/[G(z*b2hat)(1 - G(z*b2hat))], where g(.) is the chosen probability density (normal or logistic) and G(.) is the associated cumulative distribution function. This is easily computed in Stata with a few commands after you apply fracreg. Then you add it as a regressor in the second-stage binary response model.
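The generalized residual formula in this post can be written out as a short Python sketch. The function names are hypothetical and `xb` stands for the fitted first-stage index z*b2hat, which would come from the fractional probit or logit fit. This illustrates the formula only and is not the Stata workflow itself.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def logistic_cdf(x):
    return 1.0 / (1.0 + math.exp(-x))

def logistic_pdf(x):
    G = logistic_cdf(x)
    return G * (1.0 - G)

def generalized_residual(y2, xb, link="probit"):
    """g(xb) * (y2 - G(xb)) / [G(xb) * (1 - G(xb))] for one observation.

    y2 may equal 0 or 1; only the fitted G(xb) has to lie strictly in (0, 1).
    """
    if link == "probit":
        g, G = normal_pdf(xb), normal_cdf(xb)
    elif link == "logit":
        g, G = logistic_pdf(xb), logistic_cdf(xb)
    else:
        raise ValueError("link must be 'probit' or 'logit'")
    return g * (y2 - G) / (G * (1.0 - G))

# Made-up values: limit observations at 0 and 1 pose no problem.
gr_probit = [generalized_residual(y, xb, "probit")
             for y, xb in [(0.0, -0.4), (0.6, 0.2), (1.0, 1.1)]]
```

Because the logistic density satisfies g = G(1 - G), the generalized residual under the logit choice reduces to the raw residual y2 - G(z*b2hat). Under the probit choice the two differ by the factor g/[G(1 - G)].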



                • #9
                  Dear Prof Jeff,

                  Your insightful response has successfully addressed my query. I will calculate the generalized residual from the fracreg and incorporate it into the second stage, as per your guidance. Once again, thank you for your expertise and support.

                  Best,

                  Sophia



                  • #10
                    You’re very welcome. Two final comments. First, the coefficient on the CF allows you to test the null of no endogeneity. Second, the CF is treated as any other control variable when computing the average partial effect of your endogenous variable.
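As a rough illustration of the second comment, the sketch below computes the average partial effect (APE) of y2 in a second-stage probit by averaging over the sample, with the control function v2hat held at each observation's own value like any other covariate. The function name `ape_y2`, the coefficient dictionary, and all numbers are hypothetical and not estimates from any real model.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def ape_y2(coefs, y2, z1, v2hat):
    """APE of y2 for a probit with index
    const + b_y2*y2 + b_z1*z1 + b_cf*v2hat.

    The partial effect for observation i is b_y2 * phi(index_i);
    the APE is its sample average, with v2hat_i left in the index.
    """
    total = 0.0
    for y, z, v in zip(y2, z1, v2hat):
        index = (coefs["const"] + coefs["y2"] * y
                 + coefs["z1"] * z + coefs["cf"] * v)
        total += coefs["y2"] * normal_pdf(index)
    return total / len(y2)

# Hypothetical second-stage coefficients and data, for illustration only.
coefs = {"const": -0.5, "y2": 1.2, "z1": 0.3, "cf": -0.4}
y2 = [0.0, 0.4, 0.8, 1.0]
z1 = [1.0, 0.5, -0.2, 0.7]
v2hat = [-0.6, 0.1, 0.3, 0.9]
ape = ape_y2(coefs, y2, z1, v2hat)
```

The test of no endogeneity mentioned in the first comment uses the estimated coefficient on the CF (here the hypothetical `coefs["cf"]`) together with its standard error from the second-stage output.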



                    • #11
                      Dear Prof Jeff,

                      Thank you for your insightful comments. I have learned a great deal from your research. Much appreciated.

                      Best,

                      Sophia
