Bug/overcorrection with post-Stata 14.1 margins command after ivprobit?

Zhenkai Ran

Join Date: Jul 2020
Posts: 10

#16

24 Jul 2020, 08:34

Originally posted by Enrique Pinzon (StataCorp)

Dear Jeff and Joe,

Just for clarity, the predicted probability formula is correct. What Jeff
refers to is the computation of effects based on that formula.

Calculating average partial effects based on this probability formula gives a
statistic that does not have an average structural function interpretation,
because we are perturbing both observables and unobservables. What Jeff wants,
and what is more useful for asking questions about counterfactuals, is to
integrate over the unobservables and then compute average partial effects.
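
In symbols, the distinction can be sketched as follows. The notation is assumed here for illustration and is not taken from the post: u_1 and v_2 are the outcome and first-stage errors, jointly normal with Var(u_1) = 1, correlation rho, and sd(v_2) = sigma_2.

```latex
\begin{align*}
y_1 &= 1[\,x\beta + \alpha y_2 + u_1 > 0\,], \qquad y_2 = z\delta + v_2, \\
\Pr(y_1 = 1 \mid x, z, y_2)
  &= \Phi\!\left( \frac{x\beta + \alpha y_2 + (\rho/\sigma_2)\,(y_2 - z\delta)}{\sqrt{1-\rho^2}} \right), \\
\mathrm{ASF}(x, y_2)
  &= \mathrm{E}_{u_1}\bigl\{ 1[\,x\beta + \alpha y_2 + u_1 > 0\,] \bigr\}
   = \Phi(x\beta + \alpha y_2), \\
\mathrm{APE}_{y_2}
  &= \frac{1}{N} \sum_{i=1}^{N} \alpha\, \phi(x_i\beta + \alpha y_{2i}).
\end{align*}
```

Differentiating the second line with respect to y_2 also moves the residual y_2 - z*delta, which is the sense in which both observables and unobservables are perturbed. The third line holds the distribution of u_1 fixed while y_2 changes.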

You can get the predictions Jeff and Joe want using -eprobit-. We are working
on adding these predictions to -ivprobit-; a beta version is already in your
code but is not yet documented. -eprobit- gives you point estimates equivalent
to those of -ivprobit- but allows more postestimation options. -margins-
after -eprobit- computes an average structural function and gives you other
options for handling the unobservable components. It also allows for multiple
endogenous equations with different instruments. Here is a simulated-data
example of how to get the effects for the average structural function.

Code:

. clear

. set seed 123

. set obs 5000
number of observations (_N) was 0, now 5,000

. gen id = _n

. mat cov = (1,-0.5\-0.5,4)

. drawnorm e1 e2, cov(cov)

. gen x1 = runiform()

. gen x2 = runiform()

. gen z1 = runiform()

. gen z2 = runiform()

. gen zb = 1 + z1 + z2

. gen y2 = zb + e2

. gen xb = -1.5 + x1 + x2 + 0.75*y2

. gen y1 = (xb + e1) > 0

. eprobit y1 x1 x2, endog(y2 = x1 x2 z1 z2)

Iteration 0: log likelihood = -12274.567
Iteration 1: log likelihood = -12274.567

Extended probit regression                      Number of obs     =      5,000
                                                Wald chi2(3)      =     592.69
Log likelihood = -12274.567                     Prob > chi2       =     0.0000

---------------------------------------------------------------------------------
                |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
y1              |
             x1 |   1.044111   .0887834    11.76   0.000     .8700984    1.218123
             x2 |   1.040255   .0890415    11.68   0.000     .8657372    1.214774
             y2 |   .7476604   .0497003    15.04   0.000     .6502497    .8450712
          _cons |  -1.525274   .1211866   -12.59   0.000    -1.762796   -1.287753
----------------+----------------------------------------------------------------
y2              |
             x1 |  -.0647558    .097615    -0.66   0.507    -.2560777    .1265662
             x2 |  -.1312252   .0986173    -1.33   0.183    -.3245114    .0620611
             z1 |   .8950109   .0980193     9.13   0.000     .7028967    1.087125
             z2 |    .857016   .0985302     8.70   0.000     .6639004    1.050132
          _cons |   1.213411   .1019993    11.90   0.000     1.013496    1.413326
----------------+----------------------------------------------------------------
       var(e.y2)|    3.98964   .0797928                      3.836274    4.149137
----------------+----------------------------------------------------------------
 corr(e.y2,e.y1)|  -.2022677   .1319464    -1.53   0.125    -.4420192    .0644564
---------------------------------------------------------------------------------

. margins, dydx(y2) predict(pr fix(y2))

Average marginal effects                        Number of obs     =      5,000
Model VCE    : OIM

Expression   : Pr(y1==1), predict(pr fix(y2))
dy/dx w.r.t. : y2

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          y2 |   .1371811   .0057121    24.02   0.000     .1259857    .1483766
------------------------------------------------------------------------------

This quantity has an average structural function interpretation. -margins- is
integrating over the distribution of the unobservable.

In the current version of -ivprobit- you would type:

Code:

. ivprobit y1 x1 x2 (y2 = z1 z2)

Fitting exogenous probit model

Iteration 0: log likelihood = -2946.6422
Iteration 1: log likelihood = -1784.0064
Iteration 2: log likelihood = -1721.2204
Iteration 3: log likelihood = -1720.6217
Iteration 4: log likelihood = -1720.6216

Fitting full model

Iteration 0: log likelihood = -12274.568
Iteration 1: log likelihood = -12274.567
Iteration 2: log likelihood = -12274.567

Probit model with endogenous regressors         Number of obs     =      5,000
                                                Wald chi2(3)      =     592.69
Log likelihood = -12274.567                     Prob > chi2       =     0.0000

---------------------------------------------------------------------------------
                |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
             y2 |   .7476604   .0497003    15.04   0.000     .6502496    .8450712
             x1 |   1.044111   .0887834    11.76   0.000     .8700984    1.218123
             x2 |   1.040255   .0890415    11.68   0.000     .8657372    1.214774
          _cons |  -1.525274   .1211866   -12.59   0.000    -1.762796   -1.287753
----------------+----------------------------------------------------------------
 corr(e.y2,e.y1)|  -.2022676   .1319464                     -.4420192    .0644565
        sd(e.y2)|   1.997408   .0199741                      1.958641    2.036943
---------------------------------------------------------------------------------
Instrumented:  y2
Instruments:   x1 x2 z1 z2
---------------------------------------------------------------------------------
Wald test of exogeneity (corr = 0): chi2(1) = 2.22        Prob > chi2 = 0.1360

. margins, dydx(y2) predict(pr fix(y2))

Average marginal effects                        Number of obs     =      5,000
Model VCE    : OIM

Expression   : Probability of positive outcome, predict(pr fix(y2))
dy/dx w.r.t. : y2

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          y2 |   .1371811   .0057121    24.02   0.000     .1259856    .1483766
------------------------------------------------------------------------------

The syntax for the estimation may change but the option is currently there.
Joe, thanks again for the data and inquiries. We are working on adding more
predictions to -ivprobit-. Thanks to Jeff too.

Thanks very much for your comment, Enrique! It is very helpful. But I still have some doubts: I thought average marginal effects based on the average structural function were only used when IV probit is estimated by a two-step method. In your example, you used MLE instead?
Another minor question: what is the effect of adding fix() inside the predict() option? Does it fix the variables inside the parentheses at their means? If I am trying to estimate the marginal effects for all explanatory variables, should I put all my variables inside fix()?

Any comment is highly appreciated!


Enrique Pinzon (StataCorp)

StataCorp Employee

Join Date: Jan 2015

Posts: 214
#17

24 Jul 2020, 10:56

Dear Zhenkai,

What you get with fix() is G(XB), where G is the normal CDF. This is what Jeff calls the average structural function in his textbook. These are the betas of the original model. To use fix(), put the endogenous covariates in the model inside fix().
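
As a rough cross-check of the G(XB) description, the effect can be computed by hand right after the -eprobit- fit from the simulated example. This is only a sketch: it assumes the outcome-equation coefficients are stored under the equation name y1 (so that _b[y1:y2] resolves), and the variable names xb_hat and ape_i are made up for illustration.

```stata
* Run immediately after: eprobit y1 x1 x2, endog(y2 = x1 x2 z1 z2)

* Linear index XB of the outcome equation, built from the stored coefficients
* (the equation name y1 is an assumption about how eprobit labels e(b))
generate double xb_hat = _b[y1:_cons] + _b[y1:x1]*x1 + _b[y1:x2]*x2 + _b[y1:y2]*y2

* Derivative of G(XB) = normal(XB) with respect to y2, one value per observation
generate double ape_i = _b[y1:y2] * normalden(xb_hat)

* The sample mean is the average partial effect implied by G(XB)
summarize ape_i
```

If fix(y2) behaves as described, the mean of ape_i should be close to the dy/dx that -margins- reports with predict(pr fix(y2)). The shortcut does not reproduce the delta-method standard error.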

Jeff also suggests a two-step method that averages out G(XB + u), where u is estimated in a first stage. If you average out this quantity and take effects, you can get an average structural function too. By the way, you could also get this quantity by typing -margins, predict(pr base(y2=y2T))-. Here you need to create a copy of the endogenous variable beforehand; y2T is my copy. We will provide both predictions soon; I am working really hard to fix this.
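
The base() route might look like this after the same -eprobit- fit. The copy name y2T and the base(y2=y2T) call come from the post; adding dydx(y2) in the last line is an assumption made here to mirror the earlier fix() example.

```stata
* Run after: eprobit y1 x1 x2, endog(y2 = x1 x2 z1 z2)

* Keep an untouched copy of the endogenous regressor
generate double y2T = y2

* Prediction as given in the post; per the post, this averages G(XB + u)
margins, predict(pr base(y2 = y2T))

* Assumed variant: effect of y2 under the same prediction
margins, dydx(y2) predict(pr base(y2 = y2T))
```

Because y2T is a separate variable, it stays at the observed values when -margins- changes y2, which is why the copy has to be created first.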
Zhenkai Ran

Join Date: Jul 2020

Posts: 10
#18

24 Jul 2020, 17:50

Thanks very much for the clarification, Enrique! This is very helpful!!!
oliver wei

Join Date: Dec 2022

Posts: 16
#19

19 May 2023, 07:16

Very helpful! Thanks. It looks like Stata 18 does not address this issue.
FernandoRios

Join Date: Apr 2014

Posts: 2430
#20

19 May 2023, 07:30

Hi Olivier,
I believe it does. (Stata rarely makes the same mistake after correcting it.)
After this post, and some exchanges I had with them, they fixed it back in Stata 16.
In any case, if you think there is an error, you can check by replicating the calculation yourself.
I do that in this post, where I explain the problem, the solution, and how they fixed it.

https://friosavila.github.io/stata_do/stata_do4.html

F
oliver wei

Join Date: Dec 2022

Posts: 16
#21

02 Jun 2023, 08:25

Thanks, man. I have a project using IV probit, and I followed the discussion in this thread to predict the probability. Thanks so much!
oliver wei

Join Date: Dec 2022

Posts: 16
#22

02 Jun 2023, 08:46

In my setting, I have to use two fixed effects in my equation. This is my command:

margins, dydx(treatment) predict(pr fix(treatment)) nose

I see that on your website you did not use fix(). Could you explain a bit more?

Thanks in advance.
FernandoRios

Join Date: Apr 2014

Posts: 2430
#23

02 Jun 2023, 09:16

I did not use fix() because I implemented this differently from what they did officially in Stata.
Also, doing this with fixed effects is tricky because fixed effects and nonlinear models do not go well together.
hth
fernando