Partial identification using psacalc or rcr

Christos Makridis

Join Date: Nov 2014

Posts: 157
#1

25 Jan 2017, 11:21

Hey everyone,

I am trying to implement a variant of the Altonji, Elder, Taber selection on observables paper using either Krauth's package rcr or Emily Oster's recent package psacalc. As a general note, these are fantastic packages and I'd encourage others to use them more actively -- partial identification is a really neat way of doing robustness on issues that are tough to answer without a silver bullet.

My baseline model contains fixed effects and IVs, so it is a little more nuanced than the standard application. Let's take the simple example of:

reg y on x1, x2, x3, using z as an IV

For the moment, assume that x3 is a big vector of controls -- rcr only allows 25 controls, so one thing we might do is demean all the variables of x3, i.e. reg y on x3, take y_resid, x1 on x3, take x1_resid, and the same for x2 and z. Now, since we still want to incorporate the IV, we might take a control function approach where we regress y_resid on z_resid, get vhat. So the final application of rcr should be:

rcr y_resid x1_resid x2_resid vhat,lambda(0.1)
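For readers following along, the residualization step described above can be sketched as follows. This is only an illustration under assumptions: the shipped nlswork dataset and the variable roles (ln_wage as y, tenure and hours as x1 and x2, a handful of controls standing in for x3) are hypothetical stand-ins, not the poster's data, and an instrument z would be residualized the same way.

```stata
* Sketch only: nlswork and the variable roles below are illustrative assumptions
webuse nlswork, clear
local controls age grade ttl_exp not_smsa south
* Keep one common estimation sample so all residuals are comparable
keep if !missing(ln_wage, tenure, hours, age, grade, ttl_exp, not_smsa, south)

* Full regression, for reference
regress ln_wage tenure hours `controls'
scalar b_full = _b[tenure]

* Partial the controls out of the outcome and each remaining regressor
foreach v in ln_wage tenure hours {
    quietly regress `v' `controls'
    predict double `v'_resid, residuals
}

* Regression on residuals: the point estimates match the full regression
regress ln_wage_resid tenure_resid hours_resid
scalar b_fwl = _b[tenure_resid]
display "full: " b_full "   residualized: " b_fwl
```

The point estimates agree by the Frisch-Waugh-Lovell theorem, but the standard errors from the residualized regression are not adjusted for the degrees of freedom used by the partialled-out controls.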

However, that gives an error about being unable to calculate the standard errors for thetaL/H (and suggests normalizing to zero). The variables are all in common units, though, so that should not be a problem.

When trying the psacalc package, I got the error "model does not have a constant". I specifically used "areg" to run a regression first, then called "psacalc beta x1,lambda(0.1)". The weird part is that the first-stage regression does have a constant. Now, if I residualize the variables like in the rcr case, then run "reg" (instead of "areg") followed by psacalc, the estimator works, but it is odd that it does not work after areg. I am also a little confused by the output, and I don't see a description of it in the help file. Are the "alt sol1,2" lines the upper and lower bounds?

Thank you!
Jorge Eduardo Perez Perez

Join Date: Mar 2014

Posts: 429
#2

25 Jan 2017, 11:52

Hi Christos.

The error from psacalc is happening because somehow your -areg- regression is not returning a constant in the output, or the constant is not named "_cons". Can you show us exactly what you typed, or an example of this error with a Stata dataset?
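One way to check this directly is to look at the column names of the stored coefficient vector after -areg-. The sketch below is an illustration only: the nlswork dataset and the small specification are assumptions, not the regression from the original post.

```stata
* Sketch only: dataset and specification are illustrative assumptions
webuse nlswork, clear
areg ln_wage tenure age, absorb(idcode)

* Show the stored coefficient vector with its column names
matrix list e(b)

* Position of _cons among the column names (0 would mean it is absent)
local names : colnames e(b)
display "`names'"
display `: list posof "_cons" in names'
```

If "_cons" is listed here and the error persists, the message is probably being triggered by something other than a missing constant.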

"Alt sol" are alternative solutions of the estimate, for the given amount of selection on unobservables. From the helpfile:
"When there are multiple solutions, the default is to choose the solution that minimizes the distance to the estimated treatment effect in the controlled regression and does not change the direction of the bias (See Assumption 3 in Oster (2016))."

Jorge Eduardo Pérez Pérez
www.jorgeperezperez.com
Christos Makridis

Join Date: Nov 2014

Posts: 157
#3

25 Jan 2017, 12:13

Hey Jorge!

On the first point, that's odd, since when I look at the regression results from "areg", the constant has its usual name "_cons". The alternative (which I did) is to residualize the variables on the fixed effects and then just do the second stage regression without them, then do psacalc.

So my understanding of the alt sol being the lower/upper bounds is right then?
Jorge Eduardo Perez Perez

Join Date: Mar 2014

Posts: 429
#4

25 Jan 2017, 12:29

In that case it may be a bug in the command. Can you show us exactly what you typed, or reproduce the error with an example dataset?

I don't think the bounds interpretation is correct. -psacalc beta- produces an estimate of the treatment effect under selection. This estimate may not be unique though: it is the solution of a cubic equation so there may be three solutions. If you assume that the unobservables do not change the direction of the bias, then the solution is unique, and that's the first solution that -psacalc- shows. The others are alternative solutions. But they should not be interpreted as bounds, as the estimate should be one of the three solutions, not a number in the range defined by them.

Christos Makridis

Join Date: Nov 2014

Posts: 157
#5

25 Jan 2017, 12:33

Yeah sure, here are screen shots of the regression (top and bottom part so you can see the whole thing more or less).

Got it on your point, so where are the bounds then?
Jorge Eduardo Perez Perez

Join Date: Mar 2014

Posts: 429
#6

25 Jan 2017, 12:51

I cannot replicate this error with the example in the helpfile.

Code:

webuse nlswork
glo vars "tenure"
areg ln_w grade age c.age#c.age ttl_exp c.ttl_exp#c.ttl_exp $vars 2.race not_smsa south i.year [aw=hours], absorb(idcode) cluster(idcode)
psacalc beta south, delta(0.1)

Can you replicate it with a small subset of your data and post it with -dataex-?

To calculate bounds, you should calculate the estimate for several values of -delta-. Then those estimates are bounds of beta as the proportional degree of selection varies.
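A minimal sketch of that procedure, reusing the help-file example quoted in this post and the same -psacalc beta- syntax. It assumes -psacalc- is already installed; the grid of delta values is an arbitrary choice for illustration. The delta = 0 end of the range needs no call, since with no selection on unobservables the estimate is simply the coefficient from the controlled regression.

```stata
* Sketch only: assumes psacalc is installed; the delta grid is an arbitrary illustration
webuse nlswork, clear
foreach d in 0.25 0.5 0.75 1 {
    * Re-estimate before each call as a precaution, in case stored results are altered
    quietly areg ln_w grade age c.age#c.age ttl_exp c.ttl_exp#c.ttl_exp tenure 2.race not_smsa south i.year [aw=hours], absorb(idcode) cluster(idcode)
    display as text "delta = `d'"
    psacalc beta south, delta(`d')
}
```

Comparing the reported estimates across the grid shows how far the coefficient on south moves as the assumed degree of selection increases.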

Christos Makridis

Join Date: Nov 2014

Posts: 157
#7

25 Jan 2017, 12:56

Hmm, yeah I can try.

I realize the delta gauges the selection, but I guess I was expecting to see an upper/lower bound like with Krauth's rcr package. So there is nothing in the results that gives an exact upper/lower bound; it is meant more as a heuristic to get different exact point estimates (or multiple ones if the solution is not unique) under different deltas?
Jorge Eduardo Perez Perez

Join Date: Mar 2014

Posts: 429
#8

25 Jan 2017, 15:16

The paper (page 18) argues that the beta values for delta=0 and delta=1 are adequate bounds. As you change delta the point estimates you get would be between those two values.

Christos Makridis

Join Date: Nov 2014

Posts: 157
#9

25 Jan 2017, 23:52

Thanks Jorge, I think I got it now. Yeah you must be pretty familiar with it since you know her from the department!
Jorge Eduardo Perez Perez

Join Date: Mar 2014

Posts: 429
#10

26 Jan 2017, 09:33

I helped with the code in -psacalc-! That's why I'm concerned about the bug: I want to fix it!

Vrinda Kapoor

Join Date: Feb 2017

Posts: 1
#11

16 Feb 2017, 04:39

Hi Jorge,

I tried -psacalc- with areg too and encountered the same error ("Model does not have a constant") even when the regression does return a constant term (_cons). However, when I tried again with only one control, it worked! I encountered the same problem when I tried running the model with 'reg' and a lot of independent variables.

Do you have any idea what might be causing this?

Thanks,
Vrinda