  • Reporting contrasts of marginal effects from logit/probit models

    Let's say I have a model to predict the probability of an outcome, and I am interested in reporting the change between two representative values of a predictor. To make things concrete, consider this example, which predicts the probability of living in the city centre given a broad category of educational achievement.

    Code:
    . sysuse nlsw88
    . qui recode grade (0/8 = 1) (9/12 = 2) (13/max = 3), gen(educ)
    . logit c_city ibn.educ, nocons
    
    ------------------------------------------------------------------------------
          c_city |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            educ |
              1  |  -.9555114   .2631174    -3.63   0.000    -1.471212   -.4398108
              2  |  -.9834933   .0647227   -15.20   0.000    -1.110348   -.8566391
              3  |  -.7701687   .0691436   -11.14   0.000    -.9056876   -.6346498
    ------------------------------------------------------------------------------
    From the model, I'm interested in the change from the lowest to the highest level of educational achievement. On the log-odds scale, this is easily done in (at least) two ways using a linear contrast of the coefficients.

    Code:
    . contrast {educ -1 0 1}
    
    Contrasts of marginal linear predictions
    
    Margins      : asbalanced
    
    ------------------------------------------------
                 |         df        chi2     P>chi2
    -------------+----------------------------------
            educ |          1        0.46     0.4957
    ------------------------------------------------
    
    --------------------------------------------------------------
                 |   Contrast   Std. Err.     [95% Conf. Interval]
    -------------+------------------------------------------------
            educ |
            (1)  |   .1853427   .2720507     -.3478669    .7185524
    --------------------------------------------------------------
    
    . lincom 3.educ - 1.educ
    
     ( 1)  - [c_city]1bn.educ + [c_city]3.educ = 0
    
    ------------------------------------------------------------------------------
          c_city |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             (1) |   .1853427   .2720507     0.68   0.496    -.3478669    .7185524
    ------------------------------------------------------------------------------
    Or, I can transform to the probability scale and directly estimate the absolute change in probability between these two levels.

    Code:
    . margins i.educ
    
    Adjusted predictions                            Number of obs     =      2,244
    Model VCE    : OIM
    
    Expression   : Pr(c_city), predict()
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            educ |
              1  |   .2777778   .0527859     5.26   0.000     .1743193    .3812362
              2  |   .2721992    .012822    21.23   0.000     .2470685    .2973299
              3  |   .3164426   .0149562    21.16   0.000      .287129    .3457563
    ------------------------------------------------------------------------------
    
    . margins {educ -1 0 1}
    
    Contrasts of adjusted predictions
    Model VCE    : OIM
    
    Expression   : Pr(c_city), predict()
    
    ------------------------------------------------
                 |         df        chi2     P>chi2
    -------------+----------------------------------
            educ |          1        0.50     0.4810
    ------------------------------------------------
    
    --------------------------------------------------------------
                 |            Delta-method
                 |   Contrast   Std. Err.     [95% Conf. Interval]
    -------------+------------------------------------------------
            educ |
            (1)  |   .0386648   .0548638     -.0688663     .146196
    --------------------------------------------------------------
    
    . nlcom invlogit(_b[3.educ]) - invlogit(_b[1.educ])
    
           _nl_1:  invlogit(_b[3.educ]) - invlogit(_b[1.educ])
    
    ------------------------------------------------------------------------------
          c_city |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _nl_1 |   .0386648   .0548638     0.70   0.481    -.0688663     .146196
    ------------------------------------------------------------------------------
    In this example, the p-values and the relative standard errors happen to be similar, but due to the non-linear relationship between predictors and probability, this isn't always the case. When probabilities are closer to the extremes, the change in odds tends to be estimated with better precision than the change in probability. If I were only drawing an inference about the average marginal effects, I would stay with the original metric of the model (e.g., log-odds), because the hypothesis test implied on the probability scale, that Pr-hat(y|x) = 0, is nonsense.
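
    To make the scale dependence concrete, here is a sketch of the first-order delta-method approximation behind the standard errors that margins and nlcom report above. It assumes the coefficient estimates are uncorrelated, which should hold for this no-constant model with mutually exclusive indicators; the symbols \hat\beta_k and \hat p_k (coefficient and predicted probability for educ level k) are my own notation, not Stata's.

    ```latex
    % Predicted probability and its derivative with respect to the coefficient
    \hat p_k = \operatorname{invlogit}(\hat\beta_k), \qquad
    \frac{\partial \hat p_k}{\partial \hat\beta_k} = \hat p_k\,(1 - \hat p_k)

    % Delta-method standard error of one predicted probability (level 1 shown)
    \operatorname{SE}(\hat p_1) \approx \hat p_1 (1 - \hat p_1)\,\operatorname{SE}(\hat\beta_1)
      = 0.2777778 \times 0.7222222 \times 0.2631174 \approx 0.0528

    % Standard error of the contrast, assuming zero covariance between levels
    \operatorname{SE}(\hat p_3 - \hat p_1)
      \approx \sqrt{\operatorname{SE}(\hat p_3)^2 + \operatorname{SE}(\hat p_1)^2}
      = \sqrt{0.0149562^2 + 0.0527859^2} \approx 0.0549
    ```

    Each standard error is rescaled by its own factor \hat p_k (1 - \hat p_k), which depends on where the prediction sits, so the ratio of estimate to standard error is generally not preserved when moving between the log-odds and probability scales.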

    Since I am most interested in directly estimating the change in probability, rather than the change in odds, should inferences then be made on the probability scale? Or would I be needlessly sacrificing precision by not staying on the log-odds scale?
    Last edited by Leonardo Guizzetti; 19 Apr 2019, 09:24. Reason: Fixed code tags.

  • #2
    Can you clarify what you mean by "sacrificing precision"?


    • #3
      By precision I'm referring to the relative standard error (or coefficient of variation). A smaller RSE is often obtained from the logit scale compared to the probability scale with my datasets.
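
      A minimal sketch of how that comparison could be computed after the logit model in #1. It relies on the results that lincom stores in r(estimate) and r(se) and that nlcom stores in the matrices r(b) and r(V); the names rse_logit, rse_prob, nl_b and nl_V are just illustrative.

      ```stata
      * Relative standard error of the contrast on the log-odds scale
      quietly lincom 3.educ - 1.educ
      scalar rse_logit = r(se) / abs(r(estimate))

      * Relative standard error of the contrast on the probability scale
      quietly nlcom invlogit(_b[3.educ]) - invlogit(_b[1.educ])
      matrix nl_b = r(b)      // 1 x 1 matrix holding the point estimate
      matrix nl_V = r(V)      // 1 x 1 matrix holding its variance
      scalar rse_prob = sqrt(nl_V[1,1]) / abs(nl_b[1,1])

      display "RSE, log-odds scale:    " %6.3f rse_logit
      display "RSE, probability scale: " %6.3f rse_prob
      ```

      With the estimates shown in #1 this works out to roughly 1.47 on the log-odds scale and 1.42 on the probability scale. Defined this way, the RSE is the reciprocal of the absolute z statistic (0.68 and 0.70 above), so comparing RSEs across scales is equivalent to comparing the Wald z statistics.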
