Dear Statalists,
I am struggling to address sample selection bias in my focal variable, which is a dummy. My study investigates the effect of a technology T on firm performance. To do so, I constructed a dummy variable DT as a proxy for adoption: 1 if the firm adopts technology T and 0 otherwise. I went through firms' annual reports to check for adoption. This coding therefore suffers from sample selection bias, because no firm states "I don't adopt/use T in my company". In this sense, DT=0 represents missing values rather than "not adopted".
I have tried the control function approach (adding generalized residuals to the regression). But the reviewers ask me to use the inverse Mills ratio, obtained from a probit model, for further analysis (they did not explain much about this). I have looked at the Heckman two-stage procedure, but it seems to exclude DT in the last stage.
Could you all advise me on how to address this issue? This is what I have done using the control function approach:
1. Estimate the probit model for DT: Pr(DT=1|X,Z) = Φ(a + bi*Xit + ci*Zit)
2. Calculate the generalized residuals GR, as suggested by Wooldridge (2015)
3. Add GR as a new variable in the baseline model: Yit = a + bi*Xit + c*DT + d*GR + eit
My focus is c
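To make step 2 concrete, here is a minimal sketch of the generalized residual, assuming the standard probit form GR = DT*φ(w)/Φ(w) - (1-DT)*φ(w)/(1-Φ(w)), where w is the fitted probit index. The function name and its inputs are hypothetical and only for illustration; in practice w would come from whatever software estimated the probit in step 1.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, sd 1), used for phi and Phi.
_STD_NORMAL = NormalDist()


def generalized_residual(dt: int, index: float) -> float:
    """Probit generalized residual for a single observation.

    dt    : observed adoption dummy DT (0 or 1)
    index : fitted probit linear index w (hypothetical input; assumed to be
            the linear prediction from the first-stage probit)
    """
    pdf = _STD_NORMAL.pdf(index)  # phi(w)
    cdf = _STD_NORMAL.cdf(index)  # Phi(w)
    if dt == 1:
        # For DT=1 the generalized residual is the inverse Mills ratio.
        return pdf / cdf
    # For DT=0 it is minus the inverse Mills ratio evaluated at -w.
    # Note: no guard against cdf == 1 for very large w in this sketch.
    return -pdf / (1.0 - cdf)


if __name__ == "__main__":
    # Illustrative values only, not taken from my data.
    for dt, w in [(1, 0.5), (0, 0.5), (1, -1.0), (0, -1.0)]:
        print(dt, w, round(generalized_residual(dt, w), 4))
```

Under this form, the generalized residual for DT=1 observations coincides with the inverse Mills ratio, so the term the reviewers ask for may already be part of GR; I am not sure whether that is what they have in mind.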
Thanks so much for your time!