Estimated coefficients are different but marginal effects are same in Probit and IV probit regression

karamat khan

Join Date: Feb 2019

Posts: 40
#1

22 Jul 2020, 08:59

Hello everyone,
I used the following command for the probit regression and then estimated the marginal effects:

Code:

probit y x rd competition training lnage lnmanagerexp i.sml c1 c2 c3 i.isic i.ccode, vce(cluster ccode)

Code:

margins, dydx(x rd competition training lnage lnmanagerexp i.sml c1 c2 c3) post

Then I used ivprobit, where the instrumental variable is also a dummy. I added the nose option to the margins command because it was taking a long time to produce output.

Code:

ivprobit y (x=iv) rd competition training lnage lnmanagerexp i.sml c1 c2 c3 i.isic i.ccode, vce(cluster ccode)

Code:

margins, dydx(x rd competition training lnage lnmanagerexp medium large c1 c2 c3) predict(pr) post nose

I found that there is a big difference between the estimated coefficients of the probit and IV probit regressions, but the marginal effects are exactly the same. Is this possible, or am I making a mistake? Can someone please guide me?
Chris Boudreaux

Join Date: Jul 2020

Posts: 83
#2

22 Jul 2020, 10:19

A few things:

1. It's not surprising that the coefficients are different. In the probit model, you are estimating the effect of x on y. In the IV probit model, you are instrumenting x with the iv in the first-stage regression. So you are estimating the effect of the predicted x (from the first stage regression) on y.
2. I'm not sure why the margins give the same output. I can only conjecture that the x is not the predicted value from the first stage regression in your second margins output.
3. It would be helpful for you to provide the exact output from Stata. You can do that using the code delimiters, which are available via the # symbol in the command window. See the FAQ for more details.
4. Probit and IV probit are inconsistent the way you have specified them. You cannot include fixed effects like your ccode dummies because this gives rise to the incidental parameters problem. There are alternatives, however. You can specify a random effects model, a correlated random effects model, or use the conditional logit model, 'clogit' or 'xtlogit, fe'. Of course, each has its own conditions as well.
5. You also cannot make the adjustment for heteroscedasticity-consistent standard errors with models like probit. If you suspect heteroscedasticity is a problem, you might consider hetprobit or another method.
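
To make point 2 concrete, here is a minimal sketch of how an average marginal effect is built from a probit coefficient. It uses the `laborsup` example dataset from Stata's documentation, not the poster's data, and the variable and scalar names (`xb_probit`, `me_other_inc`, `ame_by_hand`) are made up for illustration. For a continuous regressor, the average marginal effect is the sample mean of normalden(xb) times the coefficient, so two fits with very different coefficients on x would not normally produce identical marginal effects.

```stata
* Sketch on a shipped example dataset (assumption: not the thread's data).
webuse laborsup, clear

probit fem_work fem_educ kids other_inc

* Average marginal effect of a continuous regressor, computed by hand:
* the sample mean of normalden(xb) * coefficient.
predict double xb_probit if e(sample), xb
generate double me_other_inc = normalden(xb_probit) * _b[other_inc]
summarize me_other_inc if e(sample)
scalar ame_by_hand = r(mean)

* The same quantity from -margins-; r(b) holds the reported dy/dx.
margins, dydx(other_inc)
matrix ame = r(b)
display "by hand: " ame_by_hand "   margins: " ame[1,1]
```

If the two displayed numbers agree, the marginal effect is being computed from the model that was just fitted. Marginal effects that match an earlier model to every digit suggest that a different set of results was used.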

karamat khan

Join Date: Feb 2019
Posts: 40

22 Jul 2020, 22:11

Dear sir, first of all, I am really thankful for your kind response. The output is below. I also expected that, since the coefficients differ between probit and ivprobit, the marginal effects should differ too, but this is not the case. So I need guidance and want to correct any mistake I am making. I am sorry if I am asking too much, but could you please write out the correct specification for any of the alternatives you suggested? I am not very good at econometrics and Stata, so I am seeking your guidance. I don't know if it is allowed to tag someone.

Coefficients from the probit regression
Command:

Code:

probit y x rd competition training lnage lnmanagerexp i.sml c1 c2 c3 i.isic i.ccode, vce(cluster ccode)

Code:

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -1.651873   .1414832   -11.68   0.000    -1.929175   -1.374571
          rd |   .7728412   .0647382    11.94   0.000     .6459567    .8997258
 competition |   .1665317    .025526     6.52   0.000     .1165017    .2165617
    training |   .4288081   .0377249    11.37   0.000     .3548687    .5027474
       lnage |   .0339876   .0125486     2.71   0.007     .0093927    .0585824
lnmanagerexp |   .0227266   .0171286     1.33   0.185    -.0108449     .056298
             |
         sml |
          2  |   .1609419   .0323035     4.98   0.000     .0976282    .2242556
          3  |   .2726596   .0456596     5.97   0.000     .1831685    .3621507
             |
          c1 |   .6409156   .0171823    37.30   0.000      .607239    .6745922
          c2 |   6.406295     .42924    14.92   0.000        5.565    7.247589
          c3 |  -.6384324   .0233147   -27.38   0.000    -.6841285   -.5927364
             |
        isic |  (output omitted)
             |
       ccode |  (output omitted)
             |
       _cons |    16.7554   1.557085    10.76   0.000     13.70357    19.80723
------------------------------------------------------------------------------

Marginal effects command I used:

Code:

margins, dydx(x rd competition training lnage lnmanagerexp i.sml c1 c2 c3) post

Code:

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -.5322573   .0436887   -12.18   0.000    -.6178855   -.4466291
          rd |   .2490205   .0193473    12.87   0.000     .2111004    .2869406
 competition |   .0536589   .0081844     6.56   0.000     .0376178       .0697
    training |   .1381681    .011718    11.79   0.000     .1152013    .1611349
       lnage |   .0109513   .0040458     2.71   0.007     .0030217    .0188808
lnmanagerexp |   .0073228    .005531     1.32   0.186    -.0035178    .0181635
             |
         sml |
          2  |   .0522896    .010672     4.90   0.000      .031373    .0732063
          3  |   .0891928   .0150413     5.93   0.000     .0597124    .1186732
             |
          c1 |   .2065122   .0054257    38.06   0.000      .195878    .2171464
          c2 |     2.0642   .1314386    15.70   0.000     1.806585    2.321815
          c3 |  -.2057121   .0068988   -29.82   0.000    -.2192335   -.1921906
------------------------------------------------------------------------------

ivprobit

Code:

ivprobit y (x=iv) rd competition training lnage lnmanagerexp i.sml c1 c2 c3 i.isic i.ccode, vce(cluster ccode)

Code:

----------------------------------------------------------------------------------------
                       |               Robust
                       |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
                     x |  -.1686717   .0073558   -22.93   0.000    -.1830889   -.1542546
                    rd |   .7724308   .0654058    11.81   0.000     .6442378    .9006238
           competition |   .1721729   .0261942     6.57   0.000     .1208332    .2235125
              training |   .4165138   .0340707    12.22   0.000     .3497364    .4832912
                 lnage |   .0210609   .0165228     1.27   0.202    -.0113233    .0534451
          lnmanagerexp |   .0245094   .0168329     1.46   0.145    -.0084825    .0575014
                       |
                   sml |
                    2  |    .155769   .0339549     4.59   0.000     .0892185    .2223194
                    3  |   .2561083   .0394717     6.49   0.000     .1787451    .3334715
                       |
                    c1 |   .6110674   .0221032    27.65   0.000      .567746    .6543889
                    c2 |   1.927669   .0619032    31.14   0.000     1.806341    2.048997
                    c3 |  -.5170284   .0194515   -26.58   0.000    -.5551526   -.4789041
                       |
                  isic |  (output omitted)
                       |
                 ccode |  (output omitted)
                       |
                 _cons |   .3489732   .0602852     5.79   0.000     .2308164    .4671299
-----------------------+----------------------------------------------------------------
         corr(e.x,e.y) |  -.1140433   .0491994                     -.2091107   -.0168403
               sd(e.x) |   .0774527   .0295963                      .0366248    .1637941
----------------------------------------------------------------------------------------
Instrumented:  x
Instruments:   rd competition training lnage lnmanagerexp 2.sml 3.sml c1 c2 c3
                 15.isic 16.isic 17.isic 18.isic 19.isic 20.isic 21.isic 22.isic
               23.isic 24.isic 25.isic 26.isic 27.isic 28.isic 29.isic 30.isic 31.isic
               32.isic 34.isic 35.isic 36.isic 37.isic 40.isic 45.isic 50.isic 51.isic
               52.isic 55.isic 60.isic 61.isic 62.isic 63.isic 64.isic 72.isic 74.isic
               5.ccode 6.ccode 8.ccode 9.ccode 11.ccode 12.ccode 13.ccode 14.ccode
               16.ccode 17.ccode 19.ccode 20.ccode 21.ccode 23.ccode 24.ccode 25.ccode
               27.ccode 28.ccode 29.ccode 31.ccode 32.ccode 33.ccode 34.ccode 35.ccode
               36.ccode 37.ccode 38.ccode 40.ccode 41.ccode 43.ccode 46.ccode
               iv
----------------------------------------------------------------------------------------
Wald test of exogeneity (corr = 0): chi2(1) = 5.28        Prob > chi2 = 0.0216

marginal effects

Code:

margins, dydx(x rd competition training lnage lnmanagerexp medium large c1 c2 c3) predict(pr) post nose

output

Code:

Average marginal effects                        Number of obs     =     20,513

Expression   : Probability of positive outcome, predict(pr)
dy/dx w.r.t. : x rd competition training lnage lnmanagerexp medium large c1 c2 c3

------------------------------------------------------------------------------
             |      dy/dx
-------------+----------------------------------------------------------------
           x |  -.5322573
          rd |   .2490205
 competition |   .0536589
    training |   .1381681
       lnage |   .0109513
lnmanagerexp |   .0073228
      medium |   .0518578
       large |   .0878548
          c1 |   .2065122
          c2 |     2.0642
          c3 |  -.2057121
------------------------------------------------------------------------------


karamat khan

Join Date: Feb 2019

Posts: 40
#4

22 Jul 2020, 22:24

One more point I want to add: you said that I cannot specify heteroscedasticity-robust or perhaps clustered standard errors, but I have many papers published in very good journals where the researchers used the same datasets I am using and a similar approach, with robust or clustered standard errors in probit regressions. What is your opinion about this?
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#5

23 Jul 2020, 01:40

Karamat, do not show the bits and pieces that you think are important. Post all the Stata output with all error messages from beginning to end.

My guess is that your second -margins- call failed, and somehow -margins- is retrieving/replaying the results from the first successful call to -margins-.
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#6

23 Jul 2020, 01:45

You can test my theory by typing immediately after your first call to -margins-

Code:

return clear
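
A related check, sketched here on the `laborsup` example dataset from Stata's documentation rather than the poster's data, is to confirm which estimation results are active before each call to -margins-. The stored-result names `m_probit`, `me_probit`, and `m_ivprobit` are hypothetical. The sketch relies on the behaviour that `margins, post` replaces the active `e()` results with its own, so `e(cmd)` reads `margins` afterwards.

```stata
* Sketch on a shipped example dataset (assumption: not the thread's data).
webuse laborsup, clear

probit fem_work fem_educ kids other_inc
estimates store m_probit
margins, dydx(*) post
* After -margins, post- the active results belong to margins, not probit.
display "active results: `e(cmd)'"
estimates store me_probit

ivprobit fem_work fem_educ kids (other_inc = male_educ)
* Confirm the active results before asking for marginal effects.
display "active results: `e(cmd)'"
estimates store m_ivprobit

* Side-by-side check that the two fits really differ.
estimates table m_probit m_ivprobit, b(%9.4f) se
```

Storing each fit under its own name also makes it possible to `estimates restore` the intended model explicitly, instead of relying on whatever happens to be in memory.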
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2163
#7

23 Jul 2020, 07:53

Hi karamat: There are actually a few threads discussing how, currently, margins is not working correctly with ivprobit. Here is one, but if you do a search you'll see others started by the Stata people.

ivprobit_and_margins

If you report "robust" standard errors then you are admitting the model is wrong and, because this is MLE, the estimators are generally inconsistent (but can still be useful). Probably we should always make our standard errors robust to model misspecification, but, currently, the profession only seems to endorse this when at least some feature -- such as the mean -- is correctly specified. Clustering is always allowed if there is the appropriate structure because it is not a model specification issue. It is either a sampling issue or an assignment issue (that is, at which level is the policy assigned?)

JW
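
The distinction between robust and clustered standard errors can be seen in a small sketch. It uses the `union` example dataset from Stata's documentation, which has repeated observations per worker identified by `idcode`; the dataset, the regressors, and the stored-result names (`v_default`, `v_robust`, `v_cluster`) are illustrative choices, not anything from the poster's data. The point estimates are the same in all three fits, and only the variance estimator changes.

```stata
* Sketch on a shipped example dataset (assumption: not the thread's data).
webuse union, clear

* Same point estimates in all three fits; only the variance estimator changes.
probit union age grade not_smsa south
estimates store v_default

probit union age grade not_smsa south, vce(robust)
estimates store v_robust

* Repeated observations on the same worker: cluster on the person identifier.
probit union age grade not_smsa south, vce(cluster idcode)
estimates store v_cluster

estimates table v_default v_robust v_cluster, b(%9.4f) se(%9.4f)
```

In the resulting table the coefficient rows should match across columns while the standard errors differ, which is consistent with clustering being a statement about the sampling or assignment structure rather than about the model for the mean.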
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#8

23 Jul 2020, 08:31

I was about to start taking issues with what Chris said, but in fact all his comments are about "advanced topics" and it is very risky to comment on those without proper supervision.

As Professor Wooldridge is here, we have proper supervision so I can go ahead:

1. -ivprobit- is a very misleading name, because this is not IV. It is either control function, or maximum likelihood. There is no instrumentation, and there are no predicted values in the first stage (if one goes for the control function approach there are predicted residuals in the first stage). It is not useful, and in fact it is quite misleading, to think of -ivprobit- as an instrumental variable estimator, because it is not.

2. One can include fixed effects in -probit- and -ivprobit-. However the fixed effects should not grow with the sample size. E.g.,
a) one has panel data on individuals i=1,2..N, over periods t=1,2..T, and one assumes that T is fixed but N goes to infinity. Including N individual fixed effects here leads to the incidental parameters problem and does not lead to consistent estimates.
b) However it is appropriate if one wants to include say industry fixed effects. Even if one has many industries, still industries can be thought of as fixed, while the number of observations within industry accumulate. We do not have the incidental parameters problem here.

3. There is nothing wrong in calculating robust variance post -probit- , -ivprobit- or other nonlinear models, as long as one understands the limitations. In particular in probit
a) robust standard errors do not guard against heteroskedasticity in the latent variable. To get probit, the latent-variable error has to have a variance of 1.
b) robust standard errors still guard against limited misspecification, if your first order conditions are correct, but you have some other misspecification in the model. The issue is decently explained in this Stata post
https://blog.stata.com/2016/08/30/tw...andard-errors/
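
The control function idea in point 1 can be sketched by hand as follows. This is only an illustration on the `laborsup` example dataset from Stata's documentation, not the poster's model, and `v_hat` is a made-up variable name. The second-step standard errors ignore the fact that `v_hat` is estimated, so they are rough except for judging the coefficient on `v_hat` itself, and the hand-made coefficients are scaled differently from the maximum likelihood ones, so the two sets are not directly comparable.

```stata
* Sketch on a shipped example dataset (assumption: not the thread's data).
webuse laborsup, clear

* Step 1: reduced form for the endogenous regressor; keep the residuals.
regress other_inc male_educ fem_educ kids
predict double v_hat, residuals

* Step 2: probit with the first-stage residual added as a control function.
* The z statistic on v_hat serves as a simple check of exogeneity.
probit fem_work other_inc fem_educ kids v_hat

* Built-in estimators for comparison: maximum likelihood (the default)
* and the two-step estimator.
ivprobit fem_work fem_educ kids (other_inc = male_educ)
ivprobit fem_work fem_educ kids (other_inc = male_educ), twostep
```

Note that the first step produces residuals, not fitted values of the endogenous regressor, which is the sense in which nothing is being "instrumented" in the second step.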

karamat khan

Join Date: Feb 2019

Posts: 40
#9

23 Jul 2020, 23:12

I am really thankful for your kind input. I will look into your suggestions, and if I have any more questions, hopefully I will get your help again. I am also really sorry about the duplicate post, which happened because of a system error. I tried to delete one but could not find the delete option. Once again, thanks.
