Dear Statalist community,
I am working with a co-author on a paper in which our independent variable of interest is most likely endogenous. Specifically, as part of the project, we analyze the relationship between the degree of investment in human capital (the independent variable of interest) and firm performance (the dependent variable, DV). We fear that the degree of investment in human capital is endogenous because of reverse causality: when firms perform very badly, they may choose to invest more in human capital.
To deal with this endogeneity problem, we initially tried to conduct an IV/2SLS analysis, but we have not managed to find a valid instrument Z for the IV/2SLS regression. Because of this, my co-author has suggested looking into the Heckman procedure as an alternative way to deal with our endogeneity problem. I have now read up on Heckman models, but I am not an expert in econometrics, so I am not sure whether I have understood everything correctly.
Thus, would there be any chance of letting me know whether the following two statements are correct?
- (1) Heckman models are used to correct for sample selection bias, where the DV is observed only for a non-random part of the sample. For example, when studying different factors that affect wages, all the observed data for the DV “wages” are for people who are actually working and who have accepted a job offer. To deal with this selection bias, a Heckman model always has a first-stage equation with a limited DV (e.g., a binary variable indicating whether someone is actually working or not). Is this statement correct?
Also, based on the above, it seems that Heckman models are not suited to our specific research project, in which we are looking at the effect of the degree of investment in human capital on firm performance and the endogeneity problem arises from reverse causality rather than from sample selection. Is that right?
- (2) Heckman models also require instruments, just as IV/2SLS regressions do. So if we cannot find a proper instrument for an IV/2SLS regression, we would most likely also not find a proper instrument to carry out a Heckman procedure. Is this statement correct?
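To make clear what I mean by selection bias in statement (1), here is a minimal sketch in Python (not Stata) using purely simulated data. Everything in it is an assumption for illustration only: the function name `simulate_selected_wages`, the base wage of 10, the correlation `rho`, and the rule that a person works when an unobserved factor `u` is positive. It only shows how the average observed wage can differ from the average wage offer when the DV is seen for workers only; it does not implement the Heckman correction itself.

```python
import random
import statistics


def simulate_selected_wages(n=20000, rho=0.8, seed=0):
    """Simulate wage offers where wages are observed only for people who work.

    Hypothetical setup: `u` is an unobserved factor that drives the decision
    to work, and the wage error is correlated with `u` through `rho`.
    Returns (mean of all wage offers, mean of observed wages, share working).
    """
    rng = random.Random(seed)
    all_offers = []
    observed = []
    for _ in range(n):
        u = rng.gauss(0, 1)  # unobserved factor in the selection (work) decision
        # Wage error with unit variance and correlation rho with u
        e = rho * u + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        wage = 10.0 + e  # assumed base wage of 10, no covariates
        all_offers.append(wage)
        if u > 0:  # the person works, so the wage is observed
            observed.append(wage)
    share_working = len(observed) / n
    return statistics.mean(all_offers), statistics.mean(observed), share_working


if __name__ == "__main__":
    mean_all, mean_observed, share = simulate_selected_wages()
    print(f"mean of all wage offers:    {mean_all:.3f}")
    print(f"mean of observed wages:     {mean_observed:.3f}")
    print(f"share of people who work:   {share:.3f}")
```

With a positive `rho`, the mean of the observed wages lies above the mean of all wage offers, which is the kind of bias I understand the Heckman model is designed to correct. As far as I can tell, our reverse causality problem has a different structure, since firm performance is observed for all firms in our sample.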
Thank you so much in advance for any feedback and advice on this issue, highly appreciated!
Best wishes,
Franz