
  • Difference between margins and predict

    Hi guys,

    I just noticed something weird and wanted your opinion. I'm probably misunderstanding how margins works, so any help would be appreciated.

    I get different values when I check predictions using margins versus when I check them using predict.

    Here's what I'm doing:

    Code:
    use http://www.stata-press.com/data/r13/sysdsn1 , clear
    
    
    . mprobit insure age i.male nonwhite i.site 
    
    Iteration 0:   log likelihood = -535.89424  
    Iteration 1:   log likelihood = -534.56173  
    Iteration 2:   log likelihood = -534.52835  
    Iteration 3:   log likelihood = -534.52833  
    
    Multinomial probit regression                   Number of obs     =        615
                                                    Wald chi2(10)     =      40.18
    Log likelihood = -534.52833                     Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
          insure |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    Indemnity    |  (base outcome)
    -------------+----------------------------------------------------------------
    Prepaid      |
             age |  -.0098536   .0052688    -1.87   0.061    -.0201802     .000473
          1.male |   .4774678   .1718316     2.78   0.005     .1406841    .8142515
        nonwhite |   .8245003   .1977582     4.17   0.000     .4369013    1.212099
                 |
            site |
              2  |   .0973956   .1794546     0.54   0.587    -.2543289    .4491201
              3  |   -.495892   .1904984    -2.60   0.009     -.869262   -.1225221
                 |
           _cons |     .22315   .2792424     0.80   0.424     -.324155    .7704549
    -------------+----------------------------------------------------------------
    Uninsure     |
             age |  -.0050814   .0075327    -0.67   0.500    -.0198452    .0096823
          1.male |   .3332637   .2432986     1.37   0.171    -.1435929    .8101203
        nonwhite |   .2485859   .2767734     0.90   0.369      -.29388    .7910518
                 |
            site |
              2  |  -.6899485   .2804497    -2.46   0.014     -1.23962   -.1402771
              3  |  -.1788447   .2479898    -0.72   0.471    -.6648957    .3072063
                 |
           _cons |  -.9855917   .3891873    -2.53   0.011    -1.748385   -.2227986
    ------------------------------------------------------------------------------
    I run a basic multinomial probit.

    Code:
    predict p1-p3 if e(sample), pr
    
    table male, content(mean p1 mean p2 mean p3)
    
    
    ----------------------------------------------
    NEMC      |
    PATIENT   |
    MALE      |   mean(p1)    mean(p2)    mean(p3)
    ----------+-----------------------------------
            0 |   .5053142    .4249825    .0697033
            1 |   .3904685    .5264811    .0830504
    I grab the predicted probabilities for the three outcomes and compare their means by male in a table.

    Code:
    . margins i.male
    
    Predictive margins                              Number of obs     =        615
    Model VCE    : OIM
    
    1._predict   : Pr(insure==Indemnity), predict(pr outcome(1))
    2._predict   : Pr(insure==Prepaid), predict(pr outcome(2))
    3._predict   : Pr(insure==Uninsure), predict(pr outcome(3))
    
    -------------------------------------------------------------------------------
                  |            Delta-method
                  |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
    _predict#male |
             1 0  |   .5089435   .0229832    22.14   0.000     .4638972    .5539897
             1 1  |   .3804479   .0386202     9.85   0.000     .3047536    .4561422
             2 0  |   .4207201   .0225325    18.67   0.000     .3765572     .464883
             2 1  |   .5390436   .0395344    13.63   0.000     .4615576    .6165296
             3 0  |   .0703364   .0119072     5.91   0.000     .0469988    .0936741
             3 1  |   .0805084   .0217201     3.71   0.000     .0379378    .1230791
    -------------------------------------------------------------------------------
    I then double-check these using the margins command.


    The differences are kind of minor, but I still don't get it. For the first outcome, men's values are 0.38 using margins and 0.39 using predict.

    Why do these differ?

  • #2
    predict p1-p3 if e(sample), pr

    table male, content(mean p1 mean p2 mean p3)


    ----------------------------------------------
    NEMC      |
    PATIENT   |
    MALE      |   mean(p1)    mean(p2)    mean(p3)
    ----------+-----------------------------------
            0 |   .5053142    .4249825    .0697033
            1 |   .3904685    .5264811    .0830504
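    This is equivalent to estimating the marginal effects (strictly speaking, the predictive margins) separately for males and females: each group's predictions are averaged over that group's own observations only. You can do this using the -over- option or the -if- qualifier.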

    Code:
    use http://www.stata-press.com/data/r13/sysdsn1 , clear
    mprobit insure age i.male nonwhite i.site
    margins, over(male)
    margins i.male if male
    margins i.male if !male
    Results:

    Code:
    . margins, over(male)
    
    Predictive margins                              Number of obs     =        615
    Model VCE    : OIM
    
    over         : male
    1._predict   : Pr(insure==Indemnity), predict(pr outcome(1))
    2._predict   : Pr(insure==Prepaid), predict(pr outcome(2))
    3._predict   : Pr(insure==Uninsure), predict(pr outcome(3))
    
    -------------------------------------------------------------------------------
                  |            Delta-method
                  |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
    _predict#male |
             1 0  |   .5053142   .0228908    22.07   0.000      .460449    .5501793
             1 1  |   .3904685   .0387491    10.08   0.000     .3145217    .4664153
             2 0  |   .4249825   .0225007    18.89   0.000     .3808819    .4690831
             2 1  |   .5264811   .0394181    13.36   0.000      .449223    .6037391
             3 0  |   .0697033   .0117633     5.93   0.000     .0466476     .092759
             3 1  |   .0830504   .0220019     3.77   0.000     .0399275    .1261733
    -------------------------------------------------------------------------------
    
    .
    . margins i.male if male
    
    Predictive margins                              Number of obs     =        154
    Model VCE    : OIM
    
    1._predict   : Pr(insure==Indemnity), predict(pr outcome(1))
    2._predict   : Pr(insure==Prepaid), predict(pr outcome(2))
    3._predict   : Pr(insure==Uninsure), predict(pr outcome(3))
    
    -------------------------------------------------------------------------------
                  |            Delta-method
                  |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
    _predict#male |
             1 1  |   .3904685   .0387491    10.08   0.000     .3145217    .4664153
             2 1  |   .5264811   .0394181    13.36   0.000      .449223    .6037391
             3 1  |   .0830504   .0220019     3.77   0.000     .0399275    .1261733
    -------------------------------------------------------------------------------
    
    .
    . margins i.male if !male
    
    Predictive margins                              Number of obs     =        461
    Model VCE    : OIM
    
    1._predict   : Pr(insure==Indemnity), predict(pr outcome(1))
    2._predict   : Pr(insure==Prepaid), predict(pr outcome(2))
    3._predict   : Pr(insure==Uninsure), predict(pr outcome(3))
    
    -------------------------------------------------------------------------------
                  |            Delta-method
                  |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
    _predict#male |
             1 0  |   .5053142   .0228908    22.07   0.000      .460449    .5501793
             2 0  |   .4249825   .0225007    18.89   0.000     .3808819    .4690831
             3 0  |   .0697033   .0117633     5.93   0.000     .0466476     .092759
    -------------------------------------------------------------------------------
    
    .
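
    One possible way to see where the gap comes from (a sketch, not part of the output above): over(male) averages each group's predictions over that group's own covariate values, so any difference in age, nonwhite, or site between men and women feeds into the result, whereas margins i.male keeps those covariates at their observed values for the whole estimation sample. The variable names are those of the model above; using tabstat and tabulate for the comparison is just one choice among several.

    ```stata
    use http://www.stata-press.com/data/r13/sysdsn1, clear
    quietly mprobit insure age i.male nonwhite i.site

    * Covariate means within each group, estimation sample only
    tabstat age nonwhite if e(sample), by(male) statistics(mean n)

    * Distribution of site within each group (column percentages)
    tabulate site male if e(sample), column
    ```

    If the two groups look different on these covariates, the group-specific averages from over(male) and the counterfactual averages from margins i.male would be expected to differ as well.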
    Last edited by Andrew Musau; 20 Aug 2020, 08:25.



    • #3
      Thanks Andrew. What does it mean to estimate the marginal effects separately for males and females? They're different predicted probabilities, but what are the assumptions behind these? Sorry if this is a super basic question. I just always thought that marginal effects were the differences in predicted probabilities (margins, dydx(i.male)) rather than the levels that margins i.male reports.
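
      For concreteness, the two quantities being contrasted here can be put side by side. A minimal sketch, using the model from #1 and restricting attention to the first outcome via predict(pr outcome(1)) purely to keep the output short:

      ```stata
      use http://www.stata-press.com/data/r13/sysdsn1, clear
      quietly mprobit insure age i.male nonwhite i.site

      * Levels: average predicted Pr(Indemnity) with male set to 0, then to 1, for everyone
      margins male, predict(pr outcome(1))

      * Marginal effect of male: the discrete change between those two levels
      margins, dydx(male) predict(pr outcome(1))
      ```

      Because male enters the model as a factor variable, the dydx(male) result should equal the difference between the two rows of the first table, which from the output in #1 would be about .380 - .509 = -.128.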
      Last edited by Ivan Privalko; 20 Aug 2020, 09:36.



      • #4
        After a regression, predict takes the estimated equation and simply computes the predicted value for each case in the data, using that case's observed values. So you get a predicted value for each case, and you can then average them or process them in any other way. margins works differently. It takes the variable of interest and creates a counterfactual in which that variable is set to a given value for every case, while all other variables are left untouched. Using the original regression estimates but the counterfactual data, it predicts the outcome for each case and averages the predictions.

        See this simple example:
        Code:
        
        sysuse nlsw88, clear
        reg wage union ttl_exp
        
        
        
              Source |       SS           df       MS      Number of obs   =     1,878
        -------------+----------------------------------   F(2, 1875)      =    165.97
               Model |  4905.33688         2  2452.66844   Prob > F        =    0.0000
            Residual |  27708.1055     1,875  14.7776563   R-squared       =    0.1504
        -------------+----------------------------------   Adj R-squared   =    0.1495
               Total |  32613.4424     1,877  17.3753023   Root MSE        =    3.8442
        
        ------------------------------------------------------------------------------
                wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
               union |   1.282932   .2064182     6.22   0.000     .8780982    1.687765
             ttl_exp |   .3234277   .0192905    16.77   0.000     .2855947    .3612607
               _cons |   3.104682   .2650058    11.72   0.000     2.584944    3.624419
        ------------------------------------------------------------------------------
        
        
        predict p, xb
        sum p
        
            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
                   p |      1,878    7.565423    1.616599      3.142   12.79259
        
        
        
        *What is the effect of union membership?
        reg wage i.union ttl_exp
        margins union
        
        
        ------------------------------------------------------------------------------
                     |            Delta-method
                     |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
               union |
           nonunion  |   7.250497   .1021582    70.97   0.000     7.050141    7.450853
              union  |   8.533429   .1792379    47.61   0.000     8.181902    8.884956
        ------------------------------------------------------------------------------
        
        
        
        *Recreate results of margins manually
        preserve
        replace union = 1 if union == 0    //What if everyone in the data was in a union?
        predict all_union, xb
        sum all_union

            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
           all_union |      1,878    8.533429    1.489835   4.424932    13.7297

        restore
        
        
        
        preserve
        replace union = 0 if union == 1    //What if everyone in the data was not in a union?
        predict all_nonunion, xb
        sum all_nonunion

            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
        all_nonunion |      1,878    7.250497    1.489835      3.142   12.44677

        restore
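
        The same manual check can be applied to the multinomial probit from #1. This is only a sketch: the variable names m1-m3 and the scalar manual_p1_male are made up for illustration, and only the male == 1 counterfactual is shown.

        ```stata
        use http://www.stata-press.com/data/r13/sysdsn1, clear
        quietly mprobit insure age i.male nonwhite i.site

        preserve
        replace male = 1                      // counterfactual: everyone is male
        predict m1 m2 m3 if e(sample), pr     // predicted probabilities of the three outcomes
        summarize m1 m2 m3                    // compare with rows "1 1", "2 1", "3 1" of margins i.male
        quietly summarize m1
        scalar manual_p1_male = r(mean)       // keep one mean after restore drops m1-m3
        restore

        display "Pr(Indemnity), everyone treated as male: " manual_p1_male
        ```

        The three means should line up with the male == 1 rows of margins i.male in #1 rather than with the table of predict means, because the averaging here runs over the whole estimation sample and not just over the men.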
        Best wishes

        (Stata 16.1 MP)



        • #5
          Thanks Felix, that's really useful.
