Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • melogit plot observed, predicted and confidence interval

    Hello,

    I run a relatively simple melogit model aiming to create a plot with the observed outcome %, the predicted and the 95% confidence interval of predictions. Mainly my problem is in failing to extract the correct 95CIs. I apologise in advance for the long message, but I hope it is easy to follow.

    Code:
    melogit outc sty sty2 [pweight=pweight] || pidp: , vce(cluster psu)
    
    Fitting fixed-effects model:
    
    Iteration 0:   log likelihood = -8085.4625  
    Iteration 1:   log likelihood = -8075.9859  
    Iteration 2:   log likelihood = -8075.9713  
    Iteration 3:   log likelihood = -8075.9713  
    
    Refining starting values:
    
    Grid node 0:   log likelihood = -7812.3482
    
    Fitting full model:
    
    Iteration 0:   log pseudolikelihood = -7812.3482  
    Iteration 1:   log pseudolikelihood = -7570.5406  
    Iteration 2:   log pseudolikelihood = -7538.8878  
    Iteration 3:   log pseudolikelihood = -7537.6672  
    Iteration 4:   log pseudolikelihood = -7537.6738  
    Iteration 5:   log pseudolikelihood = -7537.6742  
    
    Mixed-effects logistic regression               Number of obs     =     18,648
    Group variable:            pidp                 Number of groups  =     10,958
    
                                                    Obs per group:
                                                                  min =          1
                                                                  avg =        1.7
                                                                  max =          4
    
    Integration method: mvaghermite                 Integration pts.  =          7
    
                                                    Wald chi2(2)      =     157.47
    Log pseudolikelihood = -7537.6742               Prob > chi2       =     0.0000
                                    (Std. Err. adjusted for 3,492 clusters in psu)
    ------------------------------------------------------------------------------
                 |               Robust
            outc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             sty |  -.0110251   .0341653    -0.32   0.747    -.0779878    .0559377
            sty2 |   .0088006   .0023396     3.76   0.000      .004215    .0133862
           _cons |  -3.266937   .1279833   -25.53   0.000     -3.51778   -3.016095
    -------------+----------------------------------------------------------------
    pidp         |
       var(_cons)|   4.737455   .3739014                       4.05849    5.530008
    ------------------------------------------------------------------------------
    
    
    predict pred, marginal
    
    predict xb, xb
    predict stdp, stdp // SE of linear prediction
    generate lb = invlogit(xb - invnormal(0.975)*stdp)
    generate ub = invlogit(xb + invnormal(0.975)*stdp)
    
    preserve
        collapse (mean) outc pred lb ub [pweight=pweight], by(sty)
        list, noobs
    restore
    
     +-------------------------------------------------+
      | sty       outc       pred         lb         ub |
      |-------------------------------------------------|
      |   1   .1268362   .1214299   .0298667    .044889 |
      |   2   .1237994   .1225502   .0310577   .0444801 |
      |   3   .1158695   .1249783    .032453   .0453627 |
      |   4   .1359755   .1287696   .0342034   .0474336 |
      |   5   .1431683   .1340086   .0365179   .0506257 |
      |-------------------------------------------------|
      |   6   .1447255   .1408075    .039633   .0549466 |
      |   7    .163366   .1493042   .0438101   .0605016 |
      |   8   .1717911   .1596582   .0493486   .0675113 |
      |   9   .1688607   .1720452   .0565906   .0763523 |
      |  10     .19648   .1866513   .0658945   .0876459 |
      |-------------------------------------------------|
      |  11   .2013144   .2036682   .0775555   .1024173 |
      |  12   .2112961   .2232932   .0917171   .1222753 |
      |  13    .238616   .2457393   .1084317   .1494319 |
      |  14    .260764   .2712533    .127895   .1864806 |
    The observed and predicted % are very similar as expected in this example, by the CIs are way off.

    Following this earlier post https://www.statalist.org/forums/for...tic-regression I also tried:

    Code:
     
    margins, at(sty = (1(1)14)) predict(mu fixedonly) saving(margins, replace)
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |   .0633144   .0142823     4.43   0.000     .0353217    .0913072
              2  |   .0626769   .0122949     5.10   0.000     .0385793    .0867745
              3  |   .0620452     .01038     5.98   0.000     .0417008    .0823897
              4  |   .0614194   .0085593     7.18   0.000     .0446436    .0781952
              5  |   .0607994   .0068751     8.84   0.000     .0473245    .0742743
              6  |   .0601851   .0054156    11.11   0.000     .0495707    .0707996
              7  |   .0595766   .0043624    13.66   0.000     .0510263    .0681268
              8  |   .0589737   .0039959    14.76   0.000     .0511419    .0668055
              9  |   .0583764   .0044437    13.14   0.000     .0496669    .0670859
             10  |   .0577847   .0054764    10.55   0.000     .0470512    .0685183
             11  |   .0571986    .006807     8.40   0.000     .0438571      .07054
             12  |   .0566179    .008272     6.84   0.000      .040405    .0728308
             13  |   .0560427   .0097941     5.72   0.000     .0368466    .0752388
             14  |   .0554729   .0113358     4.89   0.000     .0332551    .0776906
    ------------------------------------------------------------------------------
    but these results even further differ to the observed and predicted % as above, let alone the CIs. There must be something fundamental that I am missing here, but I can't see it.


  • #2
    I suppose this has to do with your weights. Can you add the weight option to margins and see if it changes the outcome? I have tested it myself with a toy example and my results are rather close. Here, margins and the manual approach are similar. In general, I would use margins whenever possible instead of any manual computation.


    Code:
    clear all
    webuse bangladesh
    melogit c_use i.urban age child* || district:
    
    
    
    margins urban, predict(mu fixedonly)
    
    preserve
    predict pr, pr
    predict xb, xb
    predict stdp, stdp // SE of linear prediction
    generate lb = invlogit(xb - invnormal(0.975)*stdp)
    generate ub = invlogit(xb + invnormal(0.975)*stdp)
    collapse (mean) pr xb lb ub, by(urban)
    list, noobs
    restore
    Best wishes

    (Stata 16.1 MP)

    Comment


    • #3
      Hi Felix,

      Good thinking but it doesn't seem to be the issue. I tried the much simpler model

      Code:
      melogit outc sty || pidp:
      and the problem still exists. My code runs fine though at the example you suggested so I wonder whether there is something wrong with my variables. I checked them all: the outcome is a binary 0/1, sty and sty2 are basically year since follow up continuous ranging from 1 to 14, sty2 the quadratic term and pidp a numerical variable to show participant id, so no obvious issue there either. What else could it be?

      (Stata 16.1 MP too)

      Comment


      • #4
        I only see now that you use

        Code:
         
         predict pred, marginal
        but I used

        Code:
         
         predict pr, pr
        I am not sure why exactly you want to use marginal. I would need to dig into the manuals to understand what exactly the difference is. Maybe this is the source of the differences?
        Best wishes

        (Stata 16.1 MP)

        Comment


        • #5
          Apologies for the delay to come back on this. Interestingly the results are different and the predictions from

          Code:
          melogit outc sty sty2 || pidp: 
          
          predict pred, pr
          predict xb, xb
          predict stdp, stdp // SE of linear prediction
          generate lb = invlogit(xb - invnormal(0.975)*stdp)
          generate ub = invlogit(xb + invnormal(0.975)*stdp)
          
          preserve
              collapse (mean) outc pred lb ub, by(sty)
              list, noobs
          restore
          
          
            +-------------------------------------------------+
            | sty       outc       pred         lb         ub |
            |-------------------------------------------------|
            |   1   .1258278   .0749258   .0298667    .044889 |
            |   2   .1168506   .0736026   .0310577   .0444801 |
            |   3   .1147279   .0807705    .032453   .0453627 |
            |   4   .1371539   .0866576   .0342034   .0474336 |
            |   5   .1382275   .0949788   .0365179   .0506257 |
            |-------------------------------------------------|
            |   6   .1423077   .1011016    .039633   .0549466 |
            |   7   .1601104   .1125178   .0438101   .0605016 |
            |   8   .1491345   .1114364   .0493486   .0675113 |
            |   9   .1627358   .1296432   .0565906   .0763523 |
            |  10   .1679842    .135501   .0658945   .0876459 |
            |-------------------------------------------------|
            |  11   .1972789   .1513917   .0775555   .1024173 |
            |  12   .1958224   .1675273   .0917171   .1222753 |
            |  13   .2523617   .1801488   .1084317   .1494319 |
            |  14    .246696   .1973812    .127895   .1864806 |
            +-------------------------------------------------+
          are quite far away to observed values, although I was surprised by this result, I'd expect the average of individual predictions to be (at least similar) to the population average.
          In any chance, I think my problem before was with the confidence intervals.



          Comment


          • #6
            I can only guess here. Either me model is misspecified in some way that distorts the prediction. Another guess: is sty2 a squared term? If so, you better specify c.sty##c.sty. This can affect how margins makes predictions.
            Best wishes

            (Stata 16.1 MP)

            Comment

            Working...
            X