Heckman MLE and convergence issue

Remi Odry

Join Date: Apr 2018

Posts: 10
#1

06 Jan 2020, 03:57

Hello Everyone,

I am currently trying to estimate a gravity model of migration, and I am having a problem estimating it with the Heckman MLE (and, at a later stage, with the two-step estimator).
When I run it, Stata executes the command but prints nothing more after "note: 33.origin#18.years omitted because of collinearity". I don't know whether it is having difficulty converging, is stuck in a loop, or something else.
Please find attached the .do file to replicate the code. (I am currently struggling to upload the .dta files.)

Attached Files

Migration_test.do (2.0 KB, 1 view)
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#2

06 Jan 2020, 06:07

Dear Remi Odry,

Well, I do not think the estimator that you are using is suitable in the case of gravity equations, so you may want to consider a different approach such as PPML. Anyway, you probably have perfect predictors in the binary part of the model and that would cause convergence problems.
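One way to look for perfect predictors in the binary part is sketched below. It is only an illustration: `any_flow`, `cell` and `cell_share` are hypothetical names, the regressor list is a shortened subset of the variables used later in the thread, and it assumes the `origin` and `years` variables behind the `i.origin#i.years` dummies.

```stata
* Indicator for a positive flow (hypothetical variable name)
generate byte any_flow = inflows > 0 if !missing(inflows)

* One group per origin-year cell, matching the i.origin#i.years dummies
egen cell = group(origin years)

* Share of positive flows within each cell
bysort cell: egen double cell_share = mean(any_flow)

* Cells with a share of exactly 0 or 1 are perfectly predicted by their dummy
count if inlist(cell_share, 0, 1)
tabulate cell if inlist(cell_share, 0, 1)

* A plain probit is also informative: Stata normally prints a note when a
* regressor predicts success or failure perfectly and drops those observations
probit any_flow ln_gdp ln_unemploy ln_pop ln_dist i.cell
```

If the count is not zero, the selection equation of the Heckman MLE has no finite estimate for those dummies, which would be consistent with the iterations never finishing.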

Best wishes,

Joao

Remi Odry

Join Date: Apr 2018
Posts: 10

07 Jan 2020, 03:20

Dear Joao Santos Silva,

Thank you very much for your answer!
I have previously used the PPML approach for this case. Unfortunately, I found the results very suspicious.

Here are the results from the PPML estimation:

Code:

inflows          |      Coef.   Std. Err.        z    P>|z|     [95% Conf. Interval]
-----------------+------------------------------------------------------------------
ln_gdp           |   .0390131    .1055423     0.37    0.712    -.1678459    .2458722
ln_gdp_square    |  -.0266318    .0205439    -1.30    0.195    -.0668972    .0136335
ln_unemploy      |  -1.045613      .15012    -6.97    0.000    -1.339843   -.7513833
ln_pop           |   1.021717    .1129112     9.05    0.000     .8004154    1.243019
eurozone_des     |   .5547825    .1444433     3.84    0.000     .2716788    .8378862
eu_member_des    |  -.5040836    .2143855    -2.35    0.019    -.9242715   -.0838956
schengen_des     |  -.0556169    .1378951    -0.40    0.687    -.3258863    .2146525
comlang_off      |   1.533532    .2560315     5.99    0.000      1.03172    2.035345
ln_dist          |  -.5517982    .1492992    -3.70    0.000    -.8444194   -.2591771
bologna_gg2_des  |  -1.422009    .4506576    -3.16    0.002    -2.305282   -.5387361
bologna_gg3_des  |  -.3703938     .254294    -1.46    0.145    -.8688008    .1280132
bologna_gg4_des  |  -.5591961    .2330512    -2.40    0.016    -1.015968   -.1024242
bologna_gg5_des  |  -.5607078    .2370102    -2.37    0.018    -1.025239   -.0961762

And here are the results from the OLS estimation:

Code:

ln_inflows       |      Coef.   Std. Err.        t    P>|t|     [95% Conf. Interval]
-----------------+------------------------------------------------------------------
ln_gdp           |   .1534903    .0517219     2.97    0.003     .0518432    .2551374
ln_gdp_square    |  -.0243524    .0107893    -2.26    0.024    -.0455562   -.0031486
ln_unemploy      |  -.8825142    .0891559    -9.90    0.000    -1.057729   -.7072995
ln_pop           |   .8192929    .0381854    21.46    0.000     .7442485    .8943372
eurozone_des     |   .8750028    .1132806     7.72    0.000     .6523768    1.097629
eu_member_des    |  -.6769719    .1136721    -5.96    0.000    -.9003672   -.4535765
schengen_des     |   .3361431    .1063147     3.16    0.002      .127207    .5450793
comlang_off      |   1.110128    .1951285     5.69    0.000     .7266494    1.493606
ln_dist          |  -.8156473    .0885506    -9.21    0.000    -.9896723   -.6416222
bologna_gg2_des  |   3.567481    .6415462     5.56    0.000     2.306674    4.828287
bologna_gg3_des  |   4.597321    .6011356     7.65    0.000     3.415932    5.778709
bologna_gg4_des  |   4.793648    .5984072     8.01    0.000     3.617621    5.969675
bologna_gg5_des  |    4.89075    .5981438     8.18    0.000     3.715241    6.066259

And from the scaled OLS (SOLS) estimation:

Code:

ln_inflows_sc~d  |      Coef.   Std. Err.        t    P>|t|     [95% Conf. Interval]
-----------------+------------------------------------------------------------------
ln_gdp           |   .0742904    .0594863     1.25    0.212    -.0426158    .1911965
ln_gdp_square    |  -.0318696    .0123032    -2.59    0.010    -.0560485   -.0076906
ln_unemploy      |  -.9725577     .102252    -9.51    0.000     -1.17351   -.7716059
ln_pop           |   .6131845    .0525219    11.67    0.000     .5099651    .7164038
eurozone_des     |   .3312994     .144242     2.30    0.022     .0478262    .6147727
eu_member_des    |  -.4518962    .1289618    -3.50    0.001    -.7053398   -.1984526
schengen_des     |   .2362816    .1342884     1.76    0.079    -.0276303    .5001934
comlang_off      |   1.064407    .3174721     3.35    0.001     .4404916    1.688323
ln_dist          |  -.6935735    .1234246    -5.62    0.000    -.9361351    -.451012
bologna_gg2_des  |   3.303234    .7206294     4.58    0.000     1.887009    4.719459
bologna_gg3_des  |   4.467632    .6751673     6.62    0.000     3.140752    5.794512
bologna_gg4_des  |   4.063174    .6810316     5.97    0.000     2.724769    5.401579
bologna_gg5_des  |   4.396473    .6764891     6.50    0.000     3.066995    5.725951

The main issue is with the results for the Bologna indicators. The Heckman MLE gives results close to the OLS and SOLS ones (large positive coefficients), while PPML gives counter-intuitive negative coefficients.

Remi Odry

Join Date: Apr 2018
Posts: 10

07 Jan 2020, 03:26

The code used for the previous results is:

Code:

* Control regressions

* OLS, standard errors clustered on dist
regress ln_inflows ln_gdp ln_gdp_square ln_unemploy ln_pop eurozone_des eu_member_des schengen_des comlang_off ln_dist bologna_gg2_des bologna_gg3_des bologna_gg4_des bologna_gg5_des i.origin#i.years, vce(cluster dist)

* Scaled OLS, standard errors clustered on dist
regress ln_inflows_scaled ln_gdp ln_gdp_square ln_unemploy ln_pop eurozone_des eu_member_des schengen_des comlang_off ln_dist bologna_gg2_des bologna_gg3_des bologna_gg4_des bologna_gg5_des i.origin#i.years, vce(cluster dist)

* PPML

* Create origin-year dummies
egen ori_year = group(country_ori year)
quietly tabulate ori_year, generate(ori_year_d)

ppml inflows ln_gdp ln_gdp_square ln_unemploy ln_pop eurozone_des eu_member_des schengen_des comlang_off ln_dist bologna_gg2_des bologna_gg3_des bologna_gg4_des bologna_gg5_des ori_year_d*, cluster(dist)

Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#5

07 Jan 2020, 07:52

Dear Remi Odry,

Maybe your model should have a different specification? Using an estimator that we know is invalid just because its results fit with our priors can be a dangerous practice.

Best wishes,

Joao
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2168
#6

07 Jan 2020, 08:31

Remi: You don't have a selection problem, so you want to be careful about referring to the Heckman method without being clear about that. You have a corner solution. A zero is a zero, right? It's not missing data.

In that light, you're trying to estimate a two-part (or hurdle) model. OLS with log(y) as the dependent variable for y > 0 is valid for the "amount" equation under Cragg's lognormal hurdle model. Then, use a probit to estimate P(y = 0|x). There is a way to then compute the "unconditional" expectation to get overall marginal effects. Those can be compared with the PPML estimates.
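The two-part calculation described above can be sketched in Stata as follows. This is a minimal illustration, not the poster's actual specification: `pos`, `p_pos`, `xb_amt`, `sig2` and `ey_hat` are hypothetical names, the regressor list is shortened, the origin-year dummies are left out, and `ln_inflows` is assumed to be log(inflows), missing when inflows is zero. Under the lognormal hurdle assumptions the unconditional mean is E(y|x) = P(y > 0|x) * exp(x*beta + sigma^2/2).

```stata
* Part 1: probit for whether the flow is positive
generate byte pos = inflows > 0 if !missing(inflows)
probit pos ln_gdp ln_unemploy ln_pop ln_dist, vce(cluster dist)
predict double p_pos, pr                 // estimated P(y > 0 | x)

* Part 2: OLS for log(y), positive flows only
regress ln_inflows ln_gdp ln_unemploy ln_pop ln_dist if pos == 1, vce(cluster dist)
predict double xb_amt, xb                // x*beta, computed for all observations
scalar sig2 = e(rmse)^2                  // estimate of the error variance sigma^2

* Unconditional mean implied by the lognormal hurdle model
generate double ey_hat = p_pos * exp(xb_amt + sig2/2)

* Compare fitted and observed flows
summarize inflows ey_hat
```

Marginal effects on E(y|x) then come from differentiating `ey_hat` with respect to a regressor; those are the quantities comparable with PPML coefficients. Stata's official `churdle exponential` command (Stata 14 and later) appears to fit the same model in one step, although the sketch above does not rely on it.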

What you're trying to do with the Heckman approach is allow correlation between the two errors. This is not well identified and should not be used except in the case that you have a good exclusion restriction (rare in these cases). From my brief look at your do file (I gritted my teeth as I opened it), it appears your Heckman command does not have an exclusion restriction (something that determines whether one is at the corner versus not). Is that correct? If so, you should just use the Cragg two-part model.

I discuss these issues in my MIT Press book, 2010, second edition. There, I call the Heckman model the "exponential Type II Tobit model" to connect it to the literature. It is not a selection model, though, as you observe y always.
Remi Odry

Join Date: Apr 2018

Posts: 10
#7

22 Jan 2020, 08:16

Dear Joao Santos Silva,

Thank you for your remark! You are absolutely right, and I am now revising the chosen specifications.

Dear Jeff Wooldridge,

Thank you also for your comment and your valuable advice. I have spent the last few days focusing on the distribution of my dependent variable.

The following picture is the frequency distribution of my dependent variable (inflows). As you can see, it appears to follow an over-dispersed Poisson distribution (or could it also follow a negative binomial one?).

The PPML estimator developed by Joao Santos Silva can deal with over-dispersion and the presence of excess zeros.

For the Cragg two-part model, I have carefully read the chapter about corner solutions (and I am still reading others, which are really interesting). Excluding zeros (truncating the distribution) leads to the following distribution:

Code:

histogram inflows if inflows >= 1, discrete percent fcolor(black) lcolor(black) normal ytitle(Frequency)

The lognormal distribution of the latent variable assumed in the lognormal hurdle (LH) model seems to fit the observed empirical distribution better.
I hope my reasoning is correct.

Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#8

23 Jan 2020, 05:43

Dear Remi Odry,

As Jeff explained, for this kind of data you have a choice between two-part models and single-equation models, of which PPML is an example (to be clear, I wrote the ppml command and advocated its use, but did not develop the estimator). For migration, it is standard to use PPML and ignore the overdispersion that is largely irrelevant in this context. Personally, I find it difficult to see how a hurdle model would be a good description of this kind of aggregate data (countries do not decide whether to send migrants to a given destination and then decide on the number of migrants to send).

Best wishes,

Joao
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2168
#9

25 Jan 2020, 04:18

I'll differ a bit from Joao here. It seems to me that, for historical reasons and perhaps current policy emphasis, the two-part hurdle model can be a good description of migration flows. Otherwise, what explains all of the zeros? Even if one is interested only in the mean effect, it might be sensible to model P(y > 0|x) and E(y|y>0,x) separately. One possibility is to use PPML conditional on y > 0 and a logit for P(y > 0|x). This is an alternative to the lognormal hurdle or truncated normal hurdle.

What happens when you compare the PPML estimates on the entire sample with those conditional on y > 0?
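A rough sketch of that comparison is given below. It uses Stata's official `poisson` command with cluster-robust standard errors as a stand-in for the user-written `ppml` command used earlier in the thread (an assumption made for illustration); `pos_flow` and the stored-estimate names are hypothetical, and the regressor list is shortened, with the origin-year dummies left out.

```stata
* PPML on the full sample, zeros included
poisson inflows ln_gdp ln_unemploy ln_pop ln_dist, vce(cluster dist)
estimates store ppml_all

* PPML conditional on a positive flow
poisson inflows ln_gdp ln_unemploy ln_pop ln_dist if inflows > 0 & !missing(inflows), vce(cluster dist)
estimates store ppml_pos

* Logit for P(y > 0 | x)
generate byte pos_flow = inflows > 0 if !missing(inflows)
logit pos_flow ln_gdp ln_unemploy ln_pop ln_dist, vce(cluster dist)
estimates store logit_pos

* Put the two sets of PPML coefficients side by side
estimates table ppml_all ppml_pos, b(%9.4f) se(%9.4f)
```

Large differences between the two PPML columns would suggest that the zeros are generated by a different process from the positive flows, which is the case in which a two-part model is informative.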
Cappuccia Leo

Join Date: Jun 2023

Posts: 6
#10

20 Jun 2023, 07:20

Originally posted by Jeff Wooldridge:

Remi: You don't have a selection problem, so you want to be careful about referring to the Heckman method without being clear about that. You have a corner solution. A zero is a zero, right? It's not missing data.

In that light, you're trying to estimate a two-part (or hurdle) model. OLS with log(y) as the dependent variable for y > 0 is valid for the "amount" equation under Cragg's lognormal hurdle model. Then, use a probit to estimate P(y = 0|x). There is a way to then compute the "unconditional" expectation to get overall marginal effects. Those can be compared with the PPML estimates.

What you're trying to do with the Heckman approach is allow correlation between the two errors. This is not well identified and should not be used except in the case that you have a good exclusion restriction (rare in these cases). From my brief look at your do file (I gritted my teeth as I opened it), it appears your Heckman command does not have an exclusion restriction (something that determines whether one is at the corner versus not). Is that correct? If so, you should just use the Cragg two-part model.

I discuss these issues in my MIT Press book, 2010, second edition. There, I call the Heckman model the "exponential Type II Tobit model" to connect it to the literature. It is not a selection model, though, as you observe y always.

Hello, sorry to reopen this discussion, but I would just like to know how to interpret the coefficients from the "exponential Type II Tobit model" (Heckman with log(y) and an exclusion restriction). Can we directly interpret the coefficients from the second step as we would interpret coefficients from an OLS regression with log(y)? I have read Jeff Wooldridge's course slides, which are very useful, but the interpretation using Stata commands is still a bit tricky for me.

Thank you for your help and have a good day!