
  • (Slightly) different coefficients in Probit vs. 1st step Heckprobit

    Dear members,

    I am running a heckprobit and, out of curiosity, used exactly the same variables as in its first step (the selection equation) to run a "normal" probit regression.
    Although the coefficients are similar and their significance is the same, I would have expected them to be equal.
    Why do they differ?

    A related question: I am interested in the first step (the probability that individuals within a sector participate in a given program) as well as in the second step (the probability that these participants succeed in achieving a certain goal).
    So it would be useful to have joint significance statistics (Pseudo R2, LR test, etc.) for the first step as well as for the second step. Is that possible? Or is there a reason to run separate estimations (a probit to study the first step and a heckprobit to study the second step)?

    Thank you very much for your support!

    Best,

    MM

  • #2
    Showing code and output (using code tags) could help. Are the Ns identical? I don't know if Heckprobit could cause you to lose some cases because of missing data or not.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      Dear @Richard, thank you for your rapid response.

      Here are the code and output.

      This is what I get for heckprobit (there is no exclusion restriction for the moment; I am not sure whether heckprobit is the right way to go, I am just exploring for now).

      Code:
      . heckprobit suc age agesq revless30 city dummyx contract, select(partic= age agesq revless30 city dummyx contract)
      
      Fitting probit model:
      
      Iteration 0:   log likelihood = -77.904291  
      Iteration 1:   log likelihood = -72.131298  
      Iteration 2:   log likelihood = -72.108495  
      Iteration 3:   log likelihood = -72.108494  
      
      Fitting selection model:
      
      Iteration 0:   log likelihood = -382.85116  
      Iteration 1:   log likelihood = -354.83728  
      Iteration 2:   log likelihood = -354.23413  
      Iteration 3:   log likelihood = -354.23109  
      Iteration 4:   log likelihood = -354.23109  
      
      Comparison:    log likelihood = -426.33958
      
      Fitting starting values:
      
      Iteration 0:   log likelihood = -80.405073  
      Iteration 1:   log likelihood = -72.044547  
      Iteration 2:   log likelihood = -71.999982  
      Iteration 3:   log likelihood = -71.999921  
      Iteration 4:   log likelihood = -71.999921  
      
      Fitting full model:
      
      initial values not feasible
      note:  default initial values infeasible; starting from B=0
      
      Iteration 0:   log likelihood = -923.27204  (not concave)
      Iteration 1:   log likelihood = -468.88945  (not concave)
      Iteration 2:   log likelihood = -441.33546  
      Iteration 3:   log likelihood = -426.43601  (not concave)
      Iteration 4:   log likelihood = -426.36447  
      Iteration 5:   log likelihood = -426.33246  
      Iteration 6:   log likelihood = -426.33239  
      Iteration 7:   log likelihood = -426.33226  
      Iteration 8:   log likelihood = -426.33224  
      Iteration 9:   log likelihood = -426.33222  
      
      Probit model with sample selection              Number of obs     =      1,216
                                                      Censored obs      =      1,100
                                                      Uncensored obs    =        116
      
                                                      Wald chi2(6)      =      16.21
      Log likelihood = -426.3322                      Prob > chi2       =     0.0127
      
      ------------------------------------------------------------------------------
                   |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      suc          |
               age |  -.1133622    .082321    -1.38   0.168    -.2747083     .047984
             agesq |   .0009123   .0010269     0.89   0.374    -.0011003     .002925
         revless30 |   .2575267   .5897536     0.44   0.662    -.8983691    1.413423
              city |   .2051909   .3833563     0.54   0.592    -.5461737    .9565555
            dummyx |   .5615969   .2654081     2.12   0.034     .0414066    1.081787
          contract |   .5429536   1.667169     0.33   0.745    -2.724638    3.810545
             _cons |   2.032089   5.578397     0.36   0.716    -8.901368    12.96555
      -------------+----------------------------------------------------------------
      partic       |
               age |  -.0644931   .0341152    -1.89   0.059    -.1313576    .0023713
             agesq |    .000347   .0003355     1.03   0.301    -.0003105    .0010045
         revless30 |   .2512576   .1289441     1.95   0.051    -.0014681    .5039834
             paris |   .1756611   .1066171     1.65   0.099    -.0333046    .3846268
            dummyx |   .2739717   .1267822     2.16   0.031     .0254831    .5224603
          contract |  -.0703994   .1324992    -0.53   0.595    -.3300931    .1892942
             _cons |   .6850527   .8647592     0.79   0.428    -1.009844     2.37995
      -------------+----------------------------------------------------------------
           /athrho |   .5562234   5.806931     0.10   0.924    -10.82515     11.9376
      -------------+----------------------------------------------------------------
               rho |     .50517   4.325022                            -1           1
      ------------------------------------------------------------------------------
      LR test of indep. eqns. (rho = 0):   chi2(1) =     0.01   Prob > chi2 = 0.9034

      And this is what I get for the probit.

      Code:
      . probit partic age agesq revless30 city dummyx contract
      
      Iteration 0:   log likelihood = -382.85116  
      Iteration 1:   log likelihood = -354.83728  
      Iteration 2:   log likelihood = -354.23413  
      Iteration 3:   log likelihood = -354.23109  
      Iteration 4:   log likelihood = -354.23109  
      
      Probit regression                               Number of obs     =      1,216
                                                      LR chi2(6)        =      57.24
                                                      Prob > chi2       =     0.0000
      Log likelihood = -354.23109                     Pseudo R2         =     0.0748
      
      ------------------------------------------------------------------------------
            partic |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               age |  -.0649383   .0338336    -1.92   0.055    -.1312509    .0013744
             agesq |   .0003512   .0003329     1.06   0.291    -.0003012    .0010036
         revless30 |   .2490954   .1272836     1.96   0.050    -.0003758    .4985667
              city |   .1751075   .1061634     1.65   0.099     -.032969     .383184
            dummyx |   .2732354   .1266242     2.16   0.031     .0250565    .5214144
          contract |  -.0711592   .1323802    -0.54   0.591    -.3306197    .1883013
             _cons |   .6984942   .8546858     0.82   0.414    -.9766592    2.373648
      ------------------------------------------------------------------------------
      Thank you!



      • #4
        The variables in your heckprobit selection equation differ from those in the probit model. The former includes a variable paris while the latter includes a variable city instead. It is thus not surprising that the estimates differ.
        https://www.kripfganz.de/stata/



        • #5
          Sebastian Kripfganz, "paris" and "city" are the same variable under different names. I assure you the variables are identical: I copied the variable list from one command and ran the other.



          • #6
            Sebastian Kripfganz, Richard Williams: I re-ran the whole thing.
            In the first estimation I use the Heckit (heckman), with a linear second stage. Its first stage seems to be almost identical to what I get from the probit. But the first stage of the heckprobit is still only similar, not identical (as I would have expected).

            Code:
            heckman suc age agesq revless30 city manager contract, select(partic= age agesq revless30 city manager contract)
            
            Iteration 0:   log likelihood = -482.06279  (not concave)
            Iteration 1:   log likelihood =  -454.9629  
            Iteration 2:   log likelihood = -445.18871  (not concave)
            Iteration 3:   log likelihood = -434.89064  
            Iteration 4:   log likelihood = -431.37322  
            Iteration 5:   log likelihood = -430.55138  
            Iteration 6:   log likelihood = -430.32266  
            Iteration 7:   log likelihood = -430.31266  
            Iteration 8:   log likelihood = -430.31033  
            Iteration 9:   log likelihood = -430.31023  
            Iteration 10:  log likelihood = -430.31023  
            
            Heckman selection model                         Number of obs     =      1,216
            (regression model with sample selection)        Censored obs      =      1,100
                                                            Uncensored obs    =        116
            
                                                            Wald chi2(6)      =       7.19
            Log likelihood = -430.3102                      Prob > chi2       =     0.3040
            
            ------------------------------------------------------------------------------
                         |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            suc          |
                     age |  -.0340842   .0368359    -0.93   0.355    -.1062813    .0381129
                   agesq |   .0003038   .0003384     0.90   0.369    -.0003594    .0009669
               revless30 |   .0731797   .1519164     0.48   0.630     -.224571    .3709303
                    city |   .0652489   .1119053     0.58   0.560    -.1540815    .2845794
                 manager |   .1751165   .1408441     1.24   0.214    -.1009327    .4511658
                contract |   .2174551   .1147835     1.89   0.058    -.0075164    .4424265
                   _cons |   1.285449   .6865569     1.87   0.061    -.0601779    2.631076
            -------------+----------------------------------------------------------------
            partic       |
                     age |  -.0648989    .033859    -1.92   0.055    -.1312614    .0014635
                   agesq |   .0003508   .0003331     1.05   0.292    -.0003021    .0010037
               revless30 |   .2492594   .1273853     1.96   0.050    -.0004111      .49893
                    city |   .1751207   .1061642     1.65   0.099    -.0329573    .3831988
                 manager |    .273276   .1266296     2.16   0.031     .0250865    .5214655
                contract |  -.0711256   .1323801    -0.54   0.591    -.3305858    .1883346
                   _cons |   .6973707   .8554497     0.82   0.415    -.9792799    2.374021
            -------------+----------------------------------------------------------------
                 /athrho |   .0315025   .9957658     0.03   0.975    -1.920163    1.983168
                /lnsigma |  -.7626778   .0704461   -10.83   0.000    -.9007496   -.6246061
            -------------+----------------------------------------------------------------
                     rho |    .031492   .9947783                     -.9579307    .9628189
                   sigma |   .4664158   .0328572                       .406265    .5354723
                  lambda |   .0146884   .4643564                     -.8954335    .9248102
            ------------------------------------------------------------------------------
            LR test of indep. eqns. (rho = 0):   chi2(1) =     0.00   Prob > chi2 = 0.9755
            Code:
            . probit partic age agesq revless30 city manager contract
            
            Iteration 0:   log likelihood = -382.85116  
            Iteration 1:   log likelihood = -354.83728  
            Iteration 2:   log likelihood = -354.23413  
            Iteration 3:   log likelihood = -354.23109  
            Iteration 4:   log likelihood = -354.23109  
            
            Probit regression                               Number of obs     =      1,216
                                                            LR chi2(6)        =      57.24
                                                            Prob > chi2       =     0.0000
            Log likelihood = -354.23109                     Pseudo R2         =     0.0748
            
            ------------------------------------------------------------------------------
                  partic |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                     age |  -.0649383   .0338336    -1.92   0.055    -.1312509    .0013744
                   agesq |   .0003512   .0003329     1.06   0.291    -.0003012    .0010036
               revless30 |   .2490954   .1272836     1.96   0.050    -.0003758    .4985667
                    city |   .1751075   .1061634     1.65   0.099     -.032969     .383184
                 manager |   .2732354   .1266242     2.16   0.031     .0250565    .5214144
                contract |  -.0711592   .1323802    -0.54   0.591    -.3306197    .1883013
                   _cons |   .6984942   .8546858     0.82   0.414    -.9766592    2.373648
            ------------------------------------------------------------------------------
            Code:
            . heckprobit suc age agesq revless30 city manager contract, select(partic= age agesq revless30 city manager contract)
            
            Fitting probit model:
            
            Iteration 0:   log likelihood = -77.904291  
            Iteration 1:   log likelihood = -72.131298  
            Iteration 2:   log likelihood = -72.108495  
            Iteration 3:   log likelihood = -72.108494  
            
            Fitting selection model:
            
            Iteration 0:   log likelihood = -382.85116  
            Iteration 1:   log likelihood = -354.83728  
            Iteration 2:   log likelihood = -354.23413  
            Iteration 3:   log likelihood = -354.23109  
            Iteration 4:   log likelihood = -354.23109  
            
            Comparison:    log likelihood = -426.33958
            
            Fitting starting values:
            
            Iteration 0:   log likelihood = -80.405073  
            Iteration 1:   log likelihood = -72.044547  
            Iteration 2:   log likelihood = -71.999982  
            Iteration 3:   log likelihood = -71.999921  
            Iteration 4:   log likelihood = -71.999921  
            
            Fitting full model:
            
            initial values not feasible
            note:  default initial values infeasible; starting from B=0
            
            Iteration 0:   log likelihood = -923.27204  (not concave)
            Iteration 1:   log likelihood = -468.88945  (not concave)
            Iteration 2:   log likelihood = -441.33546  
            Iteration 3:   log likelihood = -426.43601  (not concave)
            Iteration 4:   log likelihood = -426.36447  
            Iteration 5:   log likelihood = -426.33246  
            Iteration 6:   log likelihood = -426.33239  
            Iteration 7:   log likelihood = -426.33226  
            Iteration 8:   log likelihood = -426.33224  
            Iteration 9:   log likelihood = -426.33222  
            
            Probit model with sample selection              Number of obs     =      1,216
                                                            Censored obs      =      1,100
                                                            Uncensored obs    =        116
            
                                                            Wald chi2(6)      =      16.21
            Log likelihood = -426.3322                      Prob > chi2       =     0.0127
            
            ------------------------------------------------------------------------------
                         |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            suc          |
                     age |  -.1133622    .082321    -1.38   0.168    -.2747083     .047984
                   agesq |   .0009123   .0010269     0.89   0.374    -.0011003     .002925
               revless30 |   .2575267   .5897536     0.44   0.662    -.8983691    1.413423
                    city |   .2051909   .3833563     0.54   0.592    -.5461737    .9565555
                 manager |   .5615969   .2654081     2.12   0.034     .0414066    1.081787
                contract |   .5429536   1.667169     0.33   0.745    -2.724638    3.810545
                   _cons |   2.032089   5.578397     0.36   0.716    -8.901368    12.96555
            -------------+----------------------------------------------------------------
            partic       |
                     age |  -.0644931   .0341152    -1.89   0.059    -.1313576    .0023713
                   agesq |    .000347   .0003355     1.03   0.301    -.0003105    .0010045
               revless30 |   .2512576   .1289441     1.95   0.051    -.0014681    .5039834
                    city |   .1756611   .1066171     1.65   0.099    -.0333046    .3846268
                 manager |   .2739717   .1267822     2.16   0.031     .0254831    .5224603
                contract |  -.0703994   .1324992    -0.53   0.595    -.3300931    .1892942
                   _cons |   .6850527   .8647592     0.79   0.428    -1.009844     2.37995
            -------------+----------------------------------------------------------------
                 /athrho |   .5562234   5.806931     0.10   0.924    -10.82515     11.9376
            -------------+----------------------------------------------------------------
                     rho |     .50517   4.325022                            -1           1
            ------------------------------------------------------------------------------
            LR test of indep. eqns. (rho = 0):   chi2(1) =     0.01   Prob > chi2 = 0.9034



            • #7
              If you add the twostep option to heckman and the first option to heckprobit, I think you'll see that the first-step results perfectly match the results from probit alone.

              Sorry, I don't understand either well enough to explain why this is, but you should be able to confirm for yourself.
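
              A minimal sketch of that check, reusing the variable names from the commands in #6. It assumes the same data are in memory for all three commands and that the lines are run from a do-file (the /// line continuations do not work when typed interactively).

              Code:
              * 1. Stand-alone probit for participation
              probit partic age agesq revless30 city manager contract

              * 2. heckman with twostep: the select equation is the first-step probit
              heckman suc age agesq revless30 city manager contract, ///
                  select(partic = age agesq revless30 city manager contract) twostep

              * 3. heckprobit with first: additionally displays the first-step probit
              *    before the jointly estimated (full maximum likelihood) results
              heckprobit suc age agesq revless30 city manager contract, ///
                  select(partic = age agesq revless30 city manager contract) first

              The partic rows from steps 1 and 2 and the first-step table printed by step 3 should agree. The final heckprobit table in step 3 is still the full maximum likelihood fit, so its partic rows can keep differing slightly.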
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology
              StataNow Version: 19.5 MP (2 processor)

              EMAIL: [email protected]
              WWW: https://www3.nd.edu/~rwilliam



              • #8
                I agree with Richard.

                The reason for the small differences should be that heckman and heckprobit by default obtain full maximum likelihood estimates, i.e. they estimate both equations jointly. Since the estimated correlation between the error terms of the two equations is nonzero, the estimation of the second equation affects the estimates of the selection equation. This is not the case in a two-step procedure, where the selection equation is fitted first, on its own. Both ways yield consistent estimates. The full maximum likelihood approach is more efficient, but the two-step approach is computationally simpler.
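
                To make the argument concrete, here is a sketch of the log likelihood that heckprobit maximizes, written without weights or offsets. The notation is an assumption introduced here, not taken from the posts above: y_i is suc, s_i is partic, x_i and z_i are the regressors of the outcome and selection equations, \beta and \gamma their coefficients, \Phi the standard normal CDF, and \Phi_2 the bivariate standard normal CDF with correlation \rho.

                \[
                \ln L = \sum_{s_i=1,\; y_i=1} \ln \Phi_2(x_i\beta,\; z_i\gamma,\; \rho)
                      + \sum_{s_i=1,\; y_i=0} \ln \Phi_2(-x_i\beta,\; z_i\gamma,\; -\rho)
                      + \sum_{s_i=0} \ln\{1 - \Phi(z_i\gamma)\}
                \]

                Because \gamma enters the first two sums together with \beta and \rho, the selection coefficients are determined partly by the outcome data. Only at \rho = 0, where \Phi_2(a, b, 0) = \Phi(a)\Phi(b), does \ln L split into two separate probit log likelihoods. That sum is the "Comparison" value in the output in #6: -72.108494 + (-354.23109) = -426.33958. The fitted model has rho = .50517 and log likelihood -426.33222, so the selection estimates move slightly away from the probit ones.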
                https://www.kripfganz.de/stata/



                • #9
                  Richard Williams Sebastian Kripfganz Thank you so much for the explanation!

                  It bugged me because I had understood they were estimated sequentially no matter what, but your explanations make everything completely clear.

                  Thank you again!

                  Have a nice day! =)
