Hi all,
I've been searching this forum furiously, consulting the Stata manuals, etc., but cannot find an answer to this question. Something is happening under the hood of the commands heckman/eregress in my selection model that is causing a loss of observations but I cannot figure out what it is.
The problem in a nutshell: I have a Heckman selection model that I can replicate in eregress with a selection equation. The selection equation models whether political violence occurs in a place or not, so Violence = a set of covariates x1-x8. This selects the instances of political violence for the second stage, where I relate them to a financial indicator.
When I do this, heckman reports 985 observations in the probit (selection) stage, of which about 231 are selected.
HOWEVER, when I do the Heckman "by hand," running the exact same probit (Violence = x1-x8) on its own, it uses 104 more observations. This of course changes the second stage considerably when I run OLS by hand.
I have pared down the covariates to the absolute minimum, and there is still a loss of observations between heckman/eregress and probit.
I have summarized the variables, and they all have similar numbers of non-missing observations.
I have tried everything I could think of but cannot figure out what is going on under the hood of heckman/eregress to drop 104 observations consistently that are retained in probit. Is there any diagnostic I can run (I've already studied the first stage of the heckman to death) to figure out which observations are dropped and why?
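One check along these lines, sketched with placeholder variable names (finind for the financial indicator, violence for the 0/1 selection indicator, and x1-x8 assumed to be stored consecutively; the outcome equation shown is also an assumption, not my exact specification), is to flag each command's estimation sample with e(sample) and look at the observations probit keeps but heckman drops:

```stata
* Placeholder names: finind (outcome), violence (0/1 selection indicator),
* x1-x8 (covariates, assumed adjacent in the dataset so the range expands).
* The outcome equation is an assumption; substitute the real specification.
heckman finind x1-x8, select(violence = x1-x8)
generate byte in_heckman = e(sample)

probit violence x1-x8
generate byte in_probit = e(sample)

* Cross-tabulate the two estimation samples to count the discrepancy.
tabulate in_probit in_heckman

* Among observations used by probit but not by heckman, check which
* variables are missing and how the selection indicator is distributed.
misstable summarize finind x1-x8 if in_probit & !in_heckman
tabulate violence if in_probit & !in_heckman
```

If the dropped rows turn out to have violence equal to 1 and a missing value in the outcome or in an outcome-equation covariate, that would be consistent with heckman excluding selected observations it cannot use in the second stage, something a standalone probit would have no reason to do. I have not confirmed that this is what is happening here.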
Thanks!!!