Dear Statalist,
I want to estimate a multinomial logit model in which one of the predictors is likely endogenous. I am thinking of using a control function approach: instrument the endogenous predictor in a first step using OLS, then include the first-stage residuals in the second-step multinomial response model (as described in Petrin and Train 2010; cf. Wooldridge 2010).
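To make the two steps concrete, the setup can be sketched as follows. The symbols are introduced here purely for illustration and are not taken from the references: x is the endogenous predictor, z the instrument, w the exogenous predictor(s), and y the nominal response with categories j = 0, ..., J, where category 0 is assumed to be the base outcome.

```latex
% Step 1 (OLS first stage): regress x on the instrument and the exogenous predictors
x_i = \pi_0 + \pi_1 z_i + \pi_2' w_i + v_i,
\qquad
\hat{v}_i = x_i - \hat{\pi}_0 - \hat{\pi}_1 z_i - \hat{\pi}_2' w_i

% Step 2 (multinomial logit with the first-stage residual as control function)
\Pr(y_i = j \mid x_i, w_i, \hat{v}_i)
  = \frac{\exp(\alpha_j + \beta_j x_i + \gamma_j' w_i + \lambda_j \hat{v}_i)}
         {\sum_{k=0}^{J} \exp(\alpha_k + \beta_k x_i + \gamma_k' w_i + \lambda_k \hat{v}_i)},
\qquad
\alpha_0 = \beta_0 = \lambda_0 = 0,\ \gamma_0 = 0
```

Because the residual in step 2 is itself estimated, the standard errors reported by the second step do not account for the sampling error of the first stage.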
I wonder, however, how to best estimate the standard errors. The recommended way of doing so seems to be to use bootstrapping, but I am not sure whether the procedure below is correct.
As a – heavily constructed – MWE, imagine that in the auto data set, repair record (rep78, recoded as y) is a nominal response variable, weight (recoded as x) is our endogenous predictor, length (recoded as z) is the instrument, and headroom is an additional exogenous predictor. What I have in mind is to do the following:
Code:
sysuse auto, clear
rename rep78 y
rename weight x
rename length z

* Control function approach:
reg x z headroom            // first stage
predict x_res, residuals    // control function
mlogit y x headroom x_res   // second stage

* Now, to correct for the uncertainty introduced by estimating the first stage, I thought of doing:
capture program drop bsses
program define bsses
    capture drop x_res      // remove residuals left over from an earlier run
    reg x z headroom
    predict x_res, residuals
    mlogit y x headroom x_res
end

bootstrap _b, reps(1000): bsses
My question is: will this procedure (i.e. bootstrapping the whole program) lead to a reasonable estimate of the standard errors for the multinomial logit model?
Any help would be much appreciated,
Max
Refs:
Petrin, Amil, and Kenneth Train. 2010. ‘A Control Function Approach to Endogeneity in Consumer Choice Models’. Journal of Marketing Research 47 (1): 3–13.
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data. Cambridge; London: The MIT Press.