melogit plot observed, predicted and confidence interval

Vicky Taxiarchi

Join Date: Dec 2021
Posts: 11

melogit plot observed, predicted and confidence interval

29 Nov 2024, 05:09

Hello,

I run a relatively simple melogit model aiming to create a plot with the observed outcome %, the predicted and the 95% confidence interval of predictions. Mainly my problem is in failing to extract the correct 95CIs. I apologise in advance for the long message, but I hope it is easy to follow.

Code:

melogit outc sty sty2 [pweight=pweight] || pidp: , vce(cluster psu)

Fitting fixed-effects model:

Iteration 0:   log likelihood = -8085.4625  
Iteration 1:   log likelihood = -8075.9859  
Iteration 2:   log likelihood = -8075.9713  
Iteration 3:   log likelihood = -8075.9713  

Refining starting values:

Grid node 0:   log likelihood = -7812.3482

Fitting full model:

Iteration 0:   log pseudolikelihood = -7812.3482  
Iteration 1:   log pseudolikelihood = -7570.5406  
Iteration 2:   log pseudolikelihood = -7538.8878  
Iteration 3:   log pseudolikelihood = -7537.6672  
Iteration 4:   log pseudolikelihood = -7537.6738  
Iteration 5:   log pseudolikelihood = -7537.6742  

Mixed-effects logistic regression               Number of obs     =     18,648
Group variable:            pidp                 Number of groups  =     10,958

                                                Obs per group:
                                                              min =          1
                                                              avg =        1.7
                                                              max =          4

Integration method: mvaghermite                 Integration pts.  =          7

                                                Wald chi2(2)      =     157.47
Log pseudolikelihood = -7537.6742               Prob > chi2       =     0.0000
                                (Std. Err. adjusted for 3,492 clusters in psu)
------------------------------------------------------------------------------
             |               Robust
        outc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sty |  -.0110251   .0341653    -0.32   0.747    -.0779878    .0559377
        sty2 |   .0088006   .0023396     3.76   0.000      .004215    .0133862
       _cons |  -3.266937   .1279833   -25.53   0.000     -3.51778   -3.016095
-------------+----------------------------------------------------------------
pidp         |
   var(_cons)|   4.737455   .3739014                       4.05849    5.530008
------------------------------------------------------------------------------


predict pred, marginal

predict xb, xb
predict stdp, stdp // SE of linear prediction
generate lb = invlogit(xb - invnormal(0.975)*stdp)
generate ub = invlogit(xb + invnormal(0.975)*stdp)

preserve
    collapse (mean) outc pred lb ub [pweight=pweight], by(sty)
    list, noobs
restore

 +-------------------------------------------------+
  | sty       outc       pred         lb         ub |
  |-------------------------------------------------|
  |   1   .1268362   .1214299   .0298667    .044889 |
  |   2   .1237994   .1225502   .0310577   .0444801 |
  |   3   .1158695   .1249783    .032453   .0453627 |
  |   4   .1359755   .1287696   .0342034   .0474336 |
  |   5   .1431683   .1340086   .0365179   .0506257 |
  |-------------------------------------------------|
  |   6   .1447255   .1408075    .039633   .0549466 |
  |   7    .163366   .1493042   .0438101   .0605016 |
  |   8   .1717911   .1596582   .0493486   .0675113 |
  |   9   .1688607   .1720452   .0565906   .0763523 |
  |  10     .19648   .1866513   .0658945   .0876459 |
  |-------------------------------------------------|
  |  11   .2013144   .2036682   .0775555   .1024173 |
  |  12   .2112961   .2232932   .0917171   .1222753 |
  |  13    .238616   .2457393   .1084317   .1494319 |
  |  14    .260764   .2712533    .127895   .1864806 |

The observed and predicted % are very similar as expected in this example, by the CIs are way off.

Following this earlier post https://www.statalist.org/forums/for...tic-regression I also tried:

Code:

 
margins, at(sty = (1(1)14)) predict(mu fixedonly) saving(margins, replace)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .0633144   .0142823     4.43   0.000     .0353217    .0913072
          2  |   .0626769   .0122949     5.10   0.000     .0385793    .0867745
          3  |   .0620452     .01038     5.98   0.000     .0417008    .0823897
          4  |   .0614194   .0085593     7.18   0.000     .0446436    .0781952
          5  |   .0607994   .0068751     8.84   0.000     .0473245    .0742743
          6  |   .0601851   .0054156    11.11   0.000     .0495707    .0707996
          7  |   .0595766   .0043624    13.66   0.000     .0510263    .0681268
          8  |   .0589737   .0039959    14.76   0.000     .0511419    .0668055
          9  |   .0583764   .0044437    13.14   0.000     .0496669    .0670859
         10  |   .0577847   .0054764    10.55   0.000     .0470512    .0685183
         11  |   .0571986    .006807     8.40   0.000     .0438571      .07054
         12  |   .0566179    .008272     6.84   0.000      .040405    .0728308
         13  |   .0560427   .0097941     5.72   0.000     .0368466    .0752388
         14  |   .0554729   .0113358     4.89   0.000     .0332551    .0776906
------------------------------------------------------------------------------

but these results even further differ to the observed and predicted % as above, let alone the CIs. There must be something fundamental that I am missing here, but I can't see it.

Tags: None

Felix Bittmann

Join Date: Aug 2018

Posts: 693
#2

29 Nov 2024, 06:07

I suppose this has to do with your weights. Can you add the weight option to margins and see if it changes the outcome? I have tested it myself with a toy example and my results are rather close. Here, margins and the manual approach are similar. In general, I would use margins whenever possible instead of any manual computation.

Code:

clear all webuse bangladesh melogit c_use i.urban age child* || district: margins urban, predict(mu fixedonly) preserve predict pr, pr predict xb, xb predict stdp, stdp // SE of linear prediction generate lb = invlogit(xb - invnormal(0.975)*stdp) generate ub = invlogit(xb + invnormal(0.975)*stdp) collapse (mean) pr xb lb ub, by(urban) list, noobs restore

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
Comment
Vicky Taxiarchi

Join Date: Dec 2021

Posts: 11
#3

29 Nov 2024, 08:09

Hi Felix,

Good thinking but it doesn't seem to be the issue. I tried the much simpler model

Code:

melogit outc sty || pidp:

and the problem still exists. My code runs fine though at the example you suggested so I wonder whether there is something wrong with my variables. I checked them all: the outcome is a binary 0/1, sty and sty2 are basically year since follow up continuous ranging from 1 to 14, sty2 the quadratic term and pidp a numerical variable to show participant id, so no obvious issue there either. What else could it be?

(Stata 16.1 MP too)
Comment
Felix Bittmann

Join Date: Aug 2018

Posts: 693
#4

29 Nov 2024, 09:38

I only see now that you use

Code:

predict pred, marginal

but I used

Code:

predict pr, pr

I am not sure why exactly you want to use marginal. I would need to dig into the manuals to understand what exactly the difference is. Maybe this is the source of the differences?

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
Comment

Vicky Taxiarchi

Join Date: Dec 2021
Posts: 11

02 Dec 2024, 06:01

Apologies for the delay to come back on this. Interestingly the results are different and the predictions from

Code:

melogit outc sty sty2 || pidp: 

predict pred, pr
predict xb, xb
predict stdp, stdp // SE of linear prediction
generate lb = invlogit(xb - invnormal(0.975)*stdp)
generate ub = invlogit(xb + invnormal(0.975)*stdp)

preserve
    collapse (mean) outc pred lb ub, by(sty)
    list, noobs
restore


  +-------------------------------------------------+
  | sty       outc       pred         lb         ub |
  |-------------------------------------------------|
  |   1   .1258278   .0749258   .0298667    .044889 |
  |   2   .1168506   .0736026   .0310577   .0444801 |
  |   3   .1147279   .0807705    .032453   .0453627 |
  |   4   .1371539   .0866576   .0342034   .0474336 |
  |   5   .1382275   .0949788   .0365179   .0506257 |
  |-------------------------------------------------|
  |   6   .1423077   .1011016    .039633   .0549466 |
  |   7   .1601104   .1125178   .0438101   .0605016 |
  |   8   .1491345   .1114364   .0493486   .0675113 |
  |   9   .1627358   .1296432   .0565906   .0763523 |
  |  10   .1679842    .135501   .0658945   .0876459 |
  |-------------------------------------------------|
  |  11   .1972789   .1513917   .0775555   .1024173 |
  |  12   .1958224   .1675273   .0917171   .1222753 |
  |  13   .2523617   .1801488   .1084317   .1494319 |
  |  14    .246696   .1973812    .127895   .1864806 |
  +-------------------------------------------------+

are quite far away to observed values, although I was surprised by this result, I'd expect the average of individual predictions to be (at least similar) to the population average.
In any chance, I think my problem before was with the confidence intervals.

Comment

Felix Bittmann

Join Date: Aug 2018

Posts: 693
#6

02 Dec 2024, 06:15

I can only guess here. Either me model is misspecified in some way that distorts the prediction. Another guess: is sty2 a squared term? If so, you better specify c.sty##c.sty. This can affect how margins makes predictions.

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
Comment

Announcement

melogit plot observed, predicted and confidence interval

Comment

Comment

Comment

Comment

Comment