Hey everyone,
I am trying to implement a variant of the Altonji, Elder, Taber selection on observables paper using either Krauth's package rcr or Emily Oster's recent package psacalc. As a general note, these are fantastic packages and I'd encourage others to use them more actively -- partial identification is a really neat way of doing robustness on issues that are tough to answer without a silver bullet.
My baseline model contains fixed effects and IVs, so it is a little more nuanced than the standard application. Let's take the simple example of:
reg y on x1, x2, x3, using z as an IV
For the moment, assume that x3 is a big vector of controls -- rcr only allows 25 controls, so one thing we might do is partial x3 out of every variable: regress y on x3 and keep the residual y_resid, regress x1 on x3 and keep x1_resid, and do the same for x2 and z. Since we still want to incorporate the IV, we might then take a control function approach where we regress y_resid on z_resid and get vhat. So the final application of rcr should be:
rcr y_resid x1_resid x2_resid vhat,lambda(0.1)
However, that gives an error about being unable to calculate the standard errors for thetaL/H (and suggests normalizing to zero). The variables are all in common units, though, so scaling should not be the problem.
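For concreteness, here is a minimal sketch of the residualization step described above, in case the way I built the residuals matters. The variable names follow the toy example; the x3_* wildcard is only a hypothetical naming scheme for the control vector, and the vhat step is not shown.

```stata
* Partial the large control vector out of y, x1, x2 and z.
* x3_* is a hypothetical wildcard standing in for the controls in x3.
foreach v of varlist y x1 x2 z {
    quietly regress `v' x3_*
    * Store the residual as y_resid, x1_resid, x2_resid, z_resid
    predict double `v'_resid, residuals
}
```

Each residual is stored in double precision so that rounding in the partialled-out variables is less likely to be a factor.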
When trying the psacalc package, I got the error "model does not have a constant". I specifically used "areg" to run a regression first, then called "psacalc beta x1,lambda(0.1)". The weird part is that the first-stage regression does have a constant. If I instead residualize the variables as in the rcr case and run "reg" (instead of "areg") followed by psacalc, the estimator works -- but it is odd that it does not work after areg. I am also a little confused by the output, and I don't see a description of it in the help file. Are the "alt sol1,2" lines the upper and lower bounds?
Thank you!