Using marginsplot to plot correct confidence intervals for predicted probabilities after logistic regression

EM Rinke

Join Date: May 2016

Posts: 14
#1

Using marginsplot to plot correct confidence intervals for predicted probabilities after logistic regression

15 May 2016, 18:24

I am using Stata 14 to fit a logistic model (via melogit) and I am getting confidence intervals below 0 and above 1 when I predict probabilities using the margins command, as discussed in this thread.

These impossible (or rather, wrongly scaled) values seem to obtain because Stata uses the delta method to estimate the standard errors of the predicted probabilities and generates the CIs from that standard error instead of estimating the standard errors from the log odds and generating the CIs from them (as discussed here).

My question are:

(a) Is it possible to obtain "correct" CIs based on the lower and upper bounds of a 95% confidence interval for the linear prediction that are then converted into "correct" probabilities, for example using the transform_margins command provided by Jeff Pitblado (see first thread linked above)?

I use

Code:

margins, at(X=(1(1)100)) predict(xb fixedonly)

to obtain linear predictions of the fixed model component and then

Code:

transform_margins invlogit(@)

to obtain CIs based on the transformed marginal linear prediction.

These CIs now range from 0-1, unlike the CIs for the predicted probabilities that can be obtained from margins itself, e.g. like so:

Code:

margins, at(X=(1(1)100)) predict(mu fixedonly)

Are the CIs obained through this application of transform_margins indeed "correct" (in the sense of the FAQ linked above)?

(b) Is there a way to plot the "correct" CIs for predicted marginal probabilities after melogit or any other GLM command (ranging from 0-1), using the marginsplot command or some workaround? The simple

Code:

marginsplot, recastci(rarea) recast(line)

obviously does not plot results obtained from transform_margins.

Thanks,
Eike

Last edited by EM Rinke; 15 May 2016, 18:27.
Tags: None
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

16 May 2016, 08:39

The -marginsplot- command, invoked after transform_margins, does not plot the transformed margins, it plots the ones originally generated by -margins-.

You can run -margins, predict(xb)- with the (undocumented) -saving()- option to obtain a .dta file containing the -margins- output. You can then apply the invlogit() function to the estimated value and its lower and upper confidence limits in that data set, and then use -graph twoway connect- to create a plot that does what you want.

As for the "correctness" of confidence intervals, the only meaning of that term that I am familiar with is that the coverage probability under repeated sampling be as advertised. In that regard, confidence intervals that include impossible values of the parameter, can nevertheless be "correct." It also bears mentioning that many procedures that estimate confidence intervals are understood to provide results that only approximate the coverage probability. To the best of my knowledge, none of the usual procedures for confidence interval estimation in logistic models produce results with exact coverage probability (although asymptotically this may be the case).
Comment
EM Rinke

Join Date: May 2016

Posts: 14
#3

18 May 2016, 16:16

Thank you, Clyde! The workaround using the undocumented -saving()- option to obtain linear predictions and using the inverse logit function works. The only issue is that the point estimate for the predicted marginal probabilities gained through this procedure differ slightly from those obtained through -margins, predict(mu fixedonly)-, mostly by 1-2 points. I am wondering: Why does this difference occur? Should the inverse logit of the linear point estimates not correspond exactly to the probabilities predicted by -margins, predict(mu fixedonly)-?

I am also wondering how to best refer to the probability-scaled CIs generated through the workaround -- these are, of course, still CIs based on delta-method SEs, just the SEs of the predicted log odds (before transformation to the probability scale). Do we have a straightforward descriptor/label for that type of CI, one that renders unnecessary the full, lengthy description of the calculation process when reporting it?

As for CI "correctness", I fully agree with your view that impossible parameter values may still be correct given that CIs are statements about long-run coverage of the parameter in question. Because this is so, it would also seem acceptable to me to simply take the standard CIs for the predictive margins produced by Stata and restrict the plotted range for the CIs produced by -marginsplot- to between 0 and 1. However, I have not seen any way of doing this in Stata. What -marginsplot- does is to plot the entire CI range (including the impossible values) and compress the y-axis, which is unfortunate, especially if you want to combine several predicted probability graphs in a single plot -- in that case the scales on the axes ideally should be equal across the graphs. Any thoughts on this?
Comment

Clyde Schechter

Join Date: Apr 2014
Posts: 30100

18 May 2016, 16:46

I just tried what you described using the -melogit- example data set bangladesh.dta and find as you do: applying invlogit() to the output of -margins, predict(xb)- does not exactly reproduce the results of -margins, predict(mu fixedonly)-. I do not understand why this is happening. My understanding, like yours, is that they should be exactly the same.

Code:

. clear*

. use http://www.stata-press.com/data/r14/bangladesh.dta
(Bangladesh Fertility Survey, 1989)

. gen byte children = 0

. forvalues i = 1/3 {
  2.         replace children = `i' if child`i' == 1
  3. }
(354 real changes made)
(307 real changes made)
(743 real changes made)

. melogit c_use i.urban c.age i.children || district:

Fitting fixed-effects model:

Iteration 0:   log likelihood = -1229.5485  
Iteration 1:   log likelihood = -1228.5268  
Iteration 2:   log likelihood = -1228.5263  
Iteration 3:   log likelihood = -1228.5263  

Refining starting values:

Grid node 0:   log likelihood = -1219.2681

Fitting full model:

Iteration 0:   log likelihood = -1219.2681  (not concave)
Iteration 1:   log likelihood = -1207.5978  
Iteration 2:   log likelihood = -1206.8428  
Iteration 3:   log likelihood = -1206.8322  
Iteration 4:   log likelihood = -1206.8322  

Mixed-effects logistic regression               Number of obs     =      1,934
Group variable:        district                 Number of groups  =         60

                                                Obs per group:
                                                              min =          2
                                                              avg =       32.2
                                                              max =        118

Integration method: mvaghermite                 Integration pts.  =          7

                                                Wald chi2(5)      =     109.60
Log likelihood = -1206.8322                     Prob > chi2       =     0.0000
------------------------------------------------------------------------------
       c_use |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
             |
       urban |
      urban  |   .7322765   .1194857     6.13   0.000     .4980888    .9664641
         age |  -.0264981   .0078916    -3.36   0.001    -.0419654   -.0110309
             |
    children |
          1  |   1.116001   .1580921     7.06   0.000     .8061465    1.425856
          2  |   1.365895   .1746691     7.82   0.000      1.02355     1.70824
          3  |   1.344031   .1796549     7.48   0.000     .9919139    1.696148
             |
       _cons |   -1.68929   .1477591   -11.43   0.000    -1.978892   -1.399687
-------------+----------------------------------------------------------------
district     |
   var(_cons)|    .215618   .0733222                      .1107208    .4198954
------------------------------------------------------------------------------
LR test vs. logistic model: chibar2(01) = 43.39       Prob >= chibar2 = 0.0000

.
. margins i.urban, predict(xb)

Predictive margins                              Number of obs     =      1,934
Model VCE    : OIM

Expression   : Linear prediction, fixed portion only, predict(xb)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       urban |
      rural  |  -.7519041   .0892822    -8.42   0.000     -.926894   -.5769142
      urban  |  -.0196276   .1206743    -0.16   0.871     -.256145    .2168897
------------------------------------------------------------------------------

. matrix M = r(table)

. display invlogit(M[1,1]), invlogit(M[1,2])
.32040655 .49509325

.
. margins i.urban, expression(invlogit(predict(xb)))

Predictive margins                              Number of obs     =      1,934
Model VCE    : OIM

Expression   : invlogit(predict(xb))

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       urban |
      rural  |   .3300778   .0187606    17.59   0.000     .2933077    .3668479
      urban  |    .497064   .0285292    17.42   0.000     .4411478    .5529803
------------------------------------------------------------------------------

.
. margins i.urban, predict(mu fixedonly)

Predictive margins                              Number of obs     =      1,934
Model VCE    : OIM

Expression   : Predicted mean, fixed portion only, predict(mu fixedonly)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       urban |
      rural  |   .3300778   .0187606    17.59   0.000     .2933077    .3668479
      urban  |    .497064   .0285292    17.42   0.000     .4411478    .5529803
------------------------------------------------------------------------------

Now, as the output above shows, I also realized that you don't need to go through the trouble of saving the -margins, predict(xb)- output and applying invlogit() because you can run -margins i.urban, expression(invlogit(predict(xb)))-, and you can run -marginsplot- after that to get what you originally wanted: a margins plot with CIs that don't go beyond 0 or 1. This approach will match the predicted probabilities given by -margins, predict(mu fixedonly)-. But now I'm not sure which is correct.

Perhaps I am missing something about the difference between -predict(mu fixedonly)- and invlogit(xb) and somebody else can explain it.

Added: With apologies, my suggestion to use -margins, expression(invlogit(predict(xb)))- will not do what you want because it will apply the delta method after taking invlogit--which is precisely what you want to avoid. So you do need to save the output of -margins, xb- and apply invlogit() to the CI values in that data set.

I still have no insight into why doing that does not yield the same predicted probabilities as -margins, predict(mu fixedonly)-.

Last edited by Clyde Schechter; 18 May 2016, 17:05.

Comment

EM Rinke

Join Date: May 2016

Posts: 14
#5

18 May 2016, 17:30

Thank you, Clyde, for further looking into this! To all others: Any explanation about why said differences between the predicted probabilities obtained through -margins, predict(mu fixedonly)- and through the process described by Clyde above occur will be very welcomed!
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#6

18 May 2016, 18:33

Wait, I see what's going on now. My logic was wrong from the start.

When we run -margins, predict(xb)- and then apply invlogit() to the results, that is not going to give us the same result as -margins, predict(mu fixed only)- because invlogit() is not a linear function. The -margins- calculations are means. mean(f(x)) != f(mean(x)) when f is non-linear.

So my original solution was misguided in the first place. Sorry for leading you down the garden path. Elementary error--egg on my face.

So, what you can do along these lines is get confidence intervals that are trimmed to 0 below and 1 above:

Code:

margins, at(X = (1(1)100)) predict(mu fixedonly) saving(margins, replace) use margins, clear replace _ci_lb = 0 if _ci_lb < 0 replace _ci_ub = 1 if _ci_ub > 1

and then do your plot from there.
1 like
Comment
EM Rinke

Join Date: May 2016

Posts: 14
#7

19 May 2016, 18:52

This makes complete sense -- thank you very much, Clyde, for solving the mystery! Also for the graphing solution which I will go with. It improves the graph a lot, especially when combined with others.
Comment

Announcement