  • How to obtain "generalised residuals" in control function approach?

    I am implementing a control function approach to deal with endogeneity: the first stage uses xtlogit (the endogenous variable is binary) and the second stage uses xtreg (the outcome variable is continuous). I need to obtain the predicted "generalised residuals" after the first-stage regression and include them as an additional variable in the second stage. I have not yet figured out how to do this in Stata and would appreciate directions on the code. Thanks.

  • #2
    I think the "score" option on predict gets you the generalised residual (GR).



    • #3
      Hi,

      I am doing something similar, albeit without panel data and with a probit first stage.

      I have computed the GR in three different ways that, reassuringly, give exactly the same results.

      Z is my instrument.

      In the second stage I just include the GR along with my control variables.

      ************************************************** ************************************************** ************
      ** Method 1 **

      Code:
      probit EEV Z $controls      
      predict gr1, score
      ** Method 2 **
      Based on : https://www.statalist.org/forums/for...ith-panel-data
      Code:
      probit EEV Z $controls
      * linear index, then the normal density and CDF evaluated at it
      predict probitxb, xb
      gen pdfprobit = normalden(probitxb)
      gen cdfprobit = normal(probitxb)

      * inverse Mills ratios for EEV == 1 and EEV == 0
      gen lamda = pdfprobit/cdfprobit
      gen pdfprobit_n = normalden(-probitxb)
      gen cdfprobit_n = normal(-probitxb)
      gen lamda_n = pdfprobit_n/cdfprobit_n
      * generalised residual
      gen gr2 = EEV*lamda - (1 - EEV)*lamda_n
      ** Method 3 **
      Based on https://www.stata.com/statalist/arch.../msg00650.html

      Code:
      probit EEV Z $controls    
      predict xb, xb
      gen gr3 = cond(EEV == 1, normalden(xb)/normal(xb), -normalden(xb)/(1-normal(xb)))
      ************************************************** ************************************************** ************
      I don't understand why Method 1 gives the same results as Methods 2 and 3 (which are obviously equivalent), since predict, score is supposed to report the "first derivative of the log likelihood with respect to xb", but it apparently works.

      George, do you happen to have an explanation as to why predict, score actually computes the GR?
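The agreement between the three methods can also be checked numerically outside Stata. The following is a minimal Python sketch, not the forum's code: the function names (`gr_method2`, `gr_method3`, `numeric_score`) are hypothetical, and it assumes the first stage is a plain probit with index `xb`. It compares the Method 2 and Method 3 formulas with a finite-difference derivative of the probit log likelihood with respect to `xb`, which is what `predict, score` is described as reporting.

```python
import math


def norm_pdf(x):
    """Standard normal density, the analogue of Stata's normalden()."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal CDF, the analogue of Stata's normal()."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gr_method2(y, xb):
    """Generalised residual built from the two inverse Mills ratios (Method 2)."""
    lam = norm_pdf(xb) / norm_cdf(xb)
    lam_n = norm_pdf(-xb) / norm_cdf(-xb)
    return y * lam - (1 - y) * lam_n


def gr_method3(y, xb):
    """Generalised residual written as a single conditional (Method 3)."""
    if y == 1:
        return norm_pdf(xb) / norm_cdf(xb)
    return -norm_pdf(xb) / (1.0 - norm_cdf(xb))


def probit_loglik(y, xb):
    """Log likelihood contribution of one observation in a probit model."""
    return y * math.log(norm_cdf(xb)) + (1 - y) * math.log(norm_cdf(-xb))


def numeric_score(y, xb, h=1e-6):
    """Central finite-difference derivative of the log likelihood w.r.t. xb."""
    return (probit_loglik(y, xb + h) - probit_loglik(y, xb - h)) / (2.0 * h)


if __name__ == "__main__":
    for y in (0, 1):
        for xb in (-2.0, -0.5, 0.0, 0.7, 1.5):
            print(y, xb, gr_method2(y, xb), gr_method3(y, xb), numeric_score(y, xb))
```

Up to finite-difference error, the three columns printed for each (y, xb) pair coincide, which is consistent with the three Stata methods returning the same variable.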



      • #4
        Because it is useful, I guess.
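A short derivation may help with the question in #3. It is a sketch for a standard probit, with \(\Phi\) and \(\phi\) denoting the standard normal CDF and density; the notation \(y_i\) for EEV and \(x_i\beta\) for the linear index is an assumption made here, not taken from the posts. Differentiating one observation's log likelihood with respect to the index gives the generalised residual directly:

```latex
\ell_i = y_i \log \Phi(x_i\beta) + (1 - y_i)\log\bigl[1 - \Phi(x_i\beta)\bigr]

\frac{\partial \ell_i}{\partial (x_i\beta)}
  = y_i\,\frac{\phi(x_i\beta)}{\Phi(x_i\beta)}
  - (1 - y_i)\,\frac{\phi(x_i\beta)}{1 - \Phi(x_i\beta)}
```

Because \(\phi(-z) = \phi(z)\) and \(1 - \Phi(z) = \Phi(-z)\), the second term equals \((1 - y_i)\,\phi(-x_i\beta)/\Phi(-x_i\beta)\), which is the lamda_n term in Method 2. So for a probit, the score with respect to xb and the generalised residual are the same expression.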



        • #5
          Thanks George Ford and Clemence Kieny. I note that with a panel dataset, Methods 2 and 3 for computing the generalised residual also give exactly the same numbers. Thank you for sharing this.

          But Method 1 does not work for me, as predict, score is only available after xtlogit ..., fe. In my case, running xtlogit with fixed effects drops a lot of observations, so I am going with random effects. Is there a way to obtain generalised residuals after xtlogit with fixed effects while keeping all observations?


          What I have: xtlogit a b c $control, fe

          where a is my endogenous variable, which takes a value of 0/1. b and c are my IVs, which also take values of 0/1.
          I get a message that states:
          note: multiple positive outcomes within groups encountered.
          note: 14,250 groups (125,319 obs) omitted because of all positive or all negative outcomes.


          Any more insights into my problem would help.
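For reference, the logit counterpart of the generalised residual has a simple closed form. This is a sketch for a pooled logit only, with \(\Lambda(z) = e^{z}/(1 + e^{z})\), \(y_i\) the binary variable a, and \(z_i\) the linear index (notation assumed here). It does not account for the random effect in xtlogit, re, where the marginal likelihood integrates over the effect and its score is not this expression.

```latex
\ell_i = y_i \log \Lambda(z_i) + (1 - y_i)\log\bigl[1 - \Lambda(z_i)\bigr],
\qquad \Lambda'(z) = \Lambda(z)\bigl[1 - \Lambda(z)\bigr]

\frac{\partial \ell_i}{\partial z_i}
  = y_i\bigl[1 - \Lambda(z_i)\bigr] - (1 - y_i)\,\Lambda(z_i)
  = y_i - \Lambda(z_i)
```

In other words, under a pooled logit the generalised residual is the observed outcome minus the fitted probability.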



          • #6
            Sounds like the outcome is 1 (or 0) for all or nearly all observations within each fixed-effect group. Is the FE a natural interpretation? Could you use a higher level of fixed effect?



            • #7
              Hi George Ford. Thanks for your reply. My outcome variable is at individual level and so is my fixed effect. I do not think I can use a higher level of fixed effect.



              • #8
                Look at the slides linked below, pp. 13-15. They explain how to handle a dichotomous first stage for CF.
                I'm curious whether you could exclude the fixed effects from the first stage and still produce a valid CF (would it still be consistent?). Sounds like a question for Wooldridge (among others), who frequents Statalist.

                HTML Code:
                https://www.irp.wisc.edu/newsevents/workshops/appliedmicroeconometrics/participants/slides/Slides_14.pdf



                • #9
                  Thanks George Ford. Indeed, I am hoping Jeff Wooldridge can shed some light on this. We want to run a first-stage panel FE model with a binary endogenous variable, followed by a second stage with a continuous outcome. The question is how to implement the control function approach in this setting and how to generate the generalised residual that the method requires.



                  • #10
                    Dear Kushneel Prakash, have you found a solution to your problem? I have a similar problem: I get many missing values when I generate the generalized residuals via:
                    Code:
                    predict u2h_fe, score
