  • Heckman Two-Step Selection Model & bivariate probit model

    I have a binary outcome and binary endogenous regressor. Suppose these are my variables:
    • Outcome variable: Y (binary)
    • Treatment variable: training (binary; training = 1 if the person received training, training = 0 otherwise). I have data on both those who received training and those who did not.
    • Selection bias: Some individuals might self-select into training programs
    • endogeneity exists
    • instrumental variable: Z
    • controls_A: all variables that are related to training
    • controls_B: all variables that affect Y (this may include some of the controls in controls_A)
    I want to control for selection bias and endogeneity. I am trying to apply Heckman's two-step method to the bivariate probit model. Here is what I have tried:

    Code:
    probit training Z controls_A
    
    *Inverse Mills Ratio 
    
    predict imr, score
    
    *Bivariate Probit Model with Selection and Endogeneity
    
    biprobit (Y= training controls_B imr) (training = Z controls_B)
    There are many methods and commands for dealing with selection bias (ivtreatreg, heckprob, etc.), so I wanted to check whether the approach I am using is valid. Could anyone please confirm? Thanks!

  • #2
    Guest: The inverse Mills ratio is redundant. The biprobit command automatically allows correlation between the two error terms, which is what adding the IMR does in the linear case. If you were to use regular probit, then including the IMR might be a useful approximation. Biprobit has a bit of an advantage in that it can be derived exactly under traditional assumptions.

    I would not make the distinction between controls_A and controls_B. Use the largest set in each model. The key is to have the external IV, Z:

    Code:
    biprobit (Y = training controls) (training = Z controls)
    You'll get an estimate of the correlation parameter, which will tell you about the nature and importance of the endogeneity.
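
    For reference, here is a sketch of the model that biprobit fits in this setup, under the traditional assumptions mentioned above. The symbols (alpha, beta, gamma, delta, u, v, rho, and X for the controls) are notation chosen for this illustration and do not appear in the original posts:

    ```latex
    \begin{aligned}
    Y &= \mathbf{1}\left[\alpha\,\text{training} + X\beta + u > 0\right], \\
    \text{training} &= \mathbf{1}\left[\gamma Z + X\delta + v > 0\right], \\
    (u, v) &\sim \text{bivariate normal},\quad
    \operatorname{Var}(u) = \operatorname{Var}(v) = 1,\quad
    \operatorname{Corr}(u, v) = \rho .
    \end{aligned}
    ```

    In this notation, training is endogenous in the Y equation exactly when rho is nonzero; with rho = 0 the system reduces to two separate probits. This is why the estimate of rho reported by biprobit speaks to the nature and importance of the endogeneity.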

    Approximate method:
    Code:
    probit training Z controls
    predict imr, score
    probit Y training controls imr
    In the second method you have to adjust the standard errors to account for imr being estimated in the first stage. And you will want to compute the average partial effect of training in each case.
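
    A minimal sketch of how the average partial effect and the standard-error adjustment might be obtained in Stata. Several things here are assumptions not stated in the thread: training is entered as i.training so that margins treats it as a 0/1 switch, controls stands in for your actual variable list, the variable name gr and the program name twostep_ape are made up for illustration, and bootstrapping both stages together (with an arbitrary 500 replications and seed) is one common way to handle the generated regressor, not the only one.

    ```stata
    * (1) APE of training from the bivariate probit.
    * i.training makes margins compute a discrete 0 -> 1 change;
    * predict(pmarg1) requests the marginal probability Pr(Y = 1).
    biprobit (Y = i.training controls) (training = Z controls)
    margins, dydx(training) predict(pmarg1)

    * (2) Approximate two-step method with bootstrapped standard errors.
    * Wrapping both stages in one program makes bootstrap re-estimate the
    * first stage in every replication.
    capture program drop twostep_ape
    program define twostep_ape, rclass
        capture drop gr
        probit training Z controls
        predict double gr, score          // first-stage score, as in the thread
        probit Y i.training controls gr
        margins, dydx(training)           // averages over the sample values
        tempname b
        matrix `b' = r(b)
        return scalar ape = el(`b', 1, colsof(`b'))   // last column is 1.training
    end

    bootstrap ape = r(ape), reps(500) seed(12345): twostep_ape
    ```

    The two APEs can then be compared directly, since both are on the probability scale.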
    Last edited by sladmin; 19 Nov 2024, 08:38. Reason: anonymize original poster

    • #3
      The biprobit command automatically allows correlation between the two error terms
      oh right, makes sense. I appreciate your help!

      Approximate method:
      Code:
      probit training Z controls
      predict imr, score
      probit Y training controls imr
      Sorry if this isn't the right question, but am I right that we shouldn't use the ivregress command in the second stage of the approximate Heckman two-step method (to deal with other endogeneity problems such as simultaneity bias), because the IMR already indirectly captures the effect of Z (though not fully), and also because both our endogenous regressor and outcome variable are binary?

      Thanks a lot for your help!
