Control function model and Heckman selection model

Alex Carr

Join Date: May 2017

Posts: 38
#1

30 Aug 2018, 13:43

Hello,

I want to examine the effect of a state program (dummy variable D) on the unemployment rate (Y). My unit of observation is city i.

For OLS, I would run:
Y_i = α_i + βD_i + X_iλ + e_i

However, the program tends to be located in cities with lower unemployment rates, so D is not random and we have a selection bias problem.

I can, of course, take the traditional IV approach and find an instrument for D.

However, I've been told that I could also address this selection bias with a Heckman selection model or a control function approach.

The steps would be:

(1) Estimate a probit model to predict the probability that city i is selected for program D:

D*_i = Z_iσ + e_i   (1)

(2) Run OLS with a selectivity variable included in the second stage:

Y_i = α_i + β'D_i + X_iλ + τ Selectivity_i + e_i   (2)
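
As an illustration, the two steps above can be sketched in a small simulation. Everything here is an assumption made for illustration and is not part of the original question: the data-generating process, the variable names, the true values (β = -1, error correlation -0.6), and the choice of the probit generalized residual as the selectivity variable. Only the Python standard library is used, so the probit is fit with a hand-written Fisher scoring loop.

```python
import math
import random

def pdf(a):
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)

def cdf(a):
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def gen_resid(d, a):
    # Probit generalized residual at index a (assumed selectivity variable).
    return d * pdf(a) / cdf(a) - (1 - d) * pdf(a) / cdf(-a)

def solve(A, b):
    # Gauss-Jordan elimination with partial pivoting for A x = b.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def xtwx(X, w):
    k = len(X[0])
    return [[sum(wi * r[p] * r[q] for wi, r in zip(w, X)) for q in range(k)]
            for p in range(k)]

def xty(X, y):
    return [sum(r[p] * yi for r, yi in zip(X, y)) for p in range(len(X[0]))]

def ols(X, y):
    return solve(xtwx(X, [1.0] * len(y)), xty(X, y))

def probit(X, d, iters=20):
    # Fisher scoring: b <- b + (X'WX)^{-1} X' (generalized residuals).
    b = [0.0] * len(X[0])
    for _ in range(iters):
        idx = [sum(bj * xj for bj, xj in zip(b, r)) for r in X]
        w = [pdf(a) ** 2 / (cdf(a) * cdf(-a)) for a in idx]
        score = xty(X, [gen_resid(di, a) for di, a in zip(d, idx)])
        step = solve(xtwx(X, w), score)
        b = [bj + sj for bj, sj in zip(b, step)]
    return b

# Hypothetical data: z is an excluded instrument, x a covariate.
random.seed(12345)
n, beta, rho = 4000, -1.0, -0.6
rows = []
for _ in range(n):
    z, x, v, eps = (random.gauss(0, 1) for _ in range(4))
    e = rho * v + math.sqrt(1 - rho ** 2) * eps   # outcome error, correlated with v
    d = 1 if z + 0.5 * x + v > 0 else 0           # program placement
    y = 1.0 + beta * d + 0.5 * x + e              # unemployment outcome
    rows.append((z, x, d, y))

D = [r[2] for r in rows]
Y = [r[3] for r in rows]

# Step (1): probit of D on all exogenous variables.
first = [[1.0, z, x] for z, x, _, _ in rows]
sigma_hat = probit(first, D)
h = [gen_resid(d, sum(s * f for s, f in zip(sigma_hat, row)))
     for d, row in zip(D, first)]

# Step (2): OLS with the selectivity variable added, next to naive OLS.
naive = ols([[1.0, d, x] for _, x, d, _ in rows], Y)
cf = ols([[1.0, d, x, hi] for (_, x, d, _), hi in zip(rows, h)], Y)
beta_naive, beta_cf, tau_hat = naive[1], cf[1], cf[3]
print("naive beta:", round(beta_naive, 3))
print("control function beta:", round(beta_cf, 3), "tau:", round(tau_hat, 3))
```

In this setup the naive OLS coefficient on D should come out more negative than the true β, because treated cities have lower unobserved unemployment, while the second-stage coefficient should land near the true value. The plain OLS standard errors from the second stage would not be valid as they stand, since the selectivity variable is itself estimated.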

My questions are:

(1) Is this approach called the Heckman correction approach or the control function approach?

(3) I understand that the Heckman approach applies when the outcome variables are unobserved for D=0 (censoring?). In my case they are observed for D=0. Can I still use this approach?

(4) Is the selectivity variable different from the inverse Mills ratio, or is it just another name for the generalized residual from a control function approach?

(5) If τ is significantly different from zero, does that mean I have evidence of endogeneity and should report β', which is consistent and unbiased?

Thank you very much.

Alex
Shruthi Venkatesh

Join Date: Aug 2016

Posts: 17
#2

30 Aug 2018, 15:19

I'm not sure why you would want to use the Heckman selection approach here, since you observe unemployment rates for non-treated states. Modeling selection here is unnecessary. The problem with your OLS specification, however, is that there is no variation apart from pre- and post-treatment, so beta cannot be identified. With respect to your questions:

(1) Control function methods are a more general class of methods. Heckman selection is a specific type of control function modeling.
(3) Heckman selection operates on the assumption that you do not observe unemployment rates for the untreated states. I wouldn't use Heckman selection here.
(4) The selectivity variable is just the inverse Mills ratio.
(5) Yes. If you were to run Heckman selection, I would report both beta and tau with their standard errors and let the reader decide.
Alex Carr

Join Date: May 2017

Posts: 38
#3

30 Aug 2018, 17:06

Originally posted by Shruthi Venkatesh

(3) Heckman selection operates on the assumption that you do not observe unemployment rates for the untreated states. I wouldn't use Heckman selection here

Thanks, Shruthi. What I proposed is a control function approach (-treatreg-), not the Heckman approach (-heckman-). In my case I do observe unemployment rates for both treated and non-treated cities, but I can still use the control function approach shown above, right?

(4) The selectivity variable is just the inverse mills ratio

I think the selectivity variables are the generalized residuals, not the inverse Mills ratio. Could someone please confirm this?
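
For reference, the two terms can be related under the textbook endogenous binary treatment model. This is a sketch under assumptions not stated in the thread: the first-stage error is relabeled v_i (post #1 uses e_i in both equations), (e_i, v_i) are jointly normal with Var(v_i) = 1, and D_i = 1[Z_iσ + v_i > 0].

```latex
\begin{align*}
E[e_i \mid D_i = 1, Z_i] &= \tau \, \frac{\phi(Z_i\sigma)}{\Phi(Z_i\sigma)}, \\
E[e_i \mid D_i = 0, Z_i] &= -\tau \, \frac{\phi(Z_i\sigma)}{1 - \Phi(Z_i\sigma)}, \\
E[Y_i \mid D_i, X_i, Z_i] &= \alpha_i + \beta' D_i + X_i\lambda
  + \tau \left[ D_i \, \frac{\phi(Z_i\sigma)}{\Phi(Z_i\sigma)}
  - (1 - D_i) \, \frac{\phi(Z_i\sigma)}{1 - \Phi(Z_i\sigma)} \right],
  \qquad \tau = \operatorname{Cov}(e_i, v_i).
\end{align*}
```

Under these assumptions the bracketed term is the probit generalized residual. It coincides with the inverse Mills ratio for treated cities and with the negated ratio evaluated at -Z_iσ for untreated ones, so the two labels differ only for the D = 0 observations.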

Alex
Alex Carr

Join Date: May 2017

Posts: 38
#4

03 Sep 2018, 21:29

Can anyone help me with this, please?
Roman Li

Join Date: Sep 2021

Posts: 6
#5

02 Oct 2021, 15:15

Hi Alex,

I have worked on one project using the control function approach. What I can confirm, to the best of my knowledge, is the following:

1. I don't think the procedure you describe here is either a control function (CF) approach or a Heckman two-stage model. I agree with Shruthi that you should prefer CF in your case because you observe Y (unemployment rate) in both treated and untreated states.

3 & 4. The e_i in your first stage cannot be treated as the generalized residual. Based on your equation (1), I guess it is just the ordinary residual from your probit estimation. The generalized residual (GR) from a probit, however, is built from the inverse Mills ratio (IMR) using the formula GR_i = D_i·IMR(Z_iσ) − (1 − D_i)·IMR(−Z_iσ), where IMR(a) = φ(a)/Φ(a). (In practice, we only obtain the predicted values of GR.) That is why I argue that your procedure is neither CF nor a Heckman two-stage model (which requires the IMR only in the second stage).
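
The generalized residual can be computed directly once the fitted probit index is available. The snippet below is a minimal sketch using only the Python standard library; the function names `imr` and `generalized_residual` are hypothetical, and it uses the standard probit form in which the untreated term evaluates the ratio at the negated index.

```python
import math

def imr(a):
    """Inverse Mills ratio phi(a) / Phi(a) for the standard normal."""
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))
    return pdf / cdf

def generalized_residual(d, index):
    """Probit generalized residual for treatment d (0 or 1) at the fitted index."""
    return d * imr(index) - (1 - d) * imr(-index)

# Treated observations get a positive value, untreated ones a negative value.
for d, index in [(1, 0.0), (0, 0.0), (1, 1.0), (0, 1.0)]:
    print(d, index, round(generalized_residual(d, index), 4))
```

In an actual application the index would be the predicted linear index from the first-stage probit, not the illustrative values used here.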

These links may also be useful for calculating the IMR, if you wish:
https://www.stata.com/support/faqs/s...s/mills-ratio/
https://www.statalist.org/forums/for...se-mills-ratio

Hope this helps,

Cheers,
Roman