  • Calibration plot for melogit

    Hi,

    I am working on a multilevel logistic regression and want to produce a calibration plot. The usual commands such as estat gof and estat roc do not work for melogit.
    I've downloaded the user-written command hl but haven't been able to find a clear description of the syntax.

    Am I heading in the right direction, or is there a more appropriate command for melogit?

    Thanks!

  • #2
    I'm not familiar with -hl-, so I can't comment on it. You can do it from first principles as follows:

    Code:
    // RUN YOUR -melogit- FIRST
    predict phat, mu
    
    //    DIVIDE DATA INTO QUANTILES OF PREDICTED PROBABILITY
    //    HERE I USE DECILES TO ILLUSTRATE, BUT IF YOUR SAMPLE IS
    //    LARGE, USE A LARGER NUMBER OF GROUPS
    xtile quantile = phat, nq(10)
    collapse (sum) predicted = phat observed = outcome_variable_from_melogit, by(quantile)
    graph twoway connected observed predicted, sort

    This will give you a graph of the number of positive outcomes observed as a function of the number predicted in each predicted-risk quantile. The placeholder outcome_variable_from_melogit needs to be replaced with the outcome variable from your own -melogit- model.

    Note: Not tested. Beware of typos.
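
    Because -collapse- replaces the data in memory, it can help to wrap the plotting steps in -preserve- and -restore-. Here is a minimal sketch of that. It assumes Stata's bangladesh example dataset (loaded with -webuse-, which needs internet access) as a stand-in for your own data, and the model shown is only illustrative:

    ```stata
    // ASSUMPTION: THE bangladesh EXAMPLE DATA STAND IN FOR YOUR OWN DATA
    webuse bangladesh, clear
    melogit c_use urban age || district:

    // PREDICTED PROBABILITY FOR EACH OBSERVATION
    predict phat, mu

    // -collapse- REPLACES THE DATA IN MEMORY, SO KEEP A COPY
    preserve
    xtile quantile = phat, nq(10)
    collapse (sum) predicted = phat observed = c_use, by(quantile)
    graph twoway connected observed predicted, sort
    restore
    ```

    After -restore-, the original observations (plus phat) are back in memory, so the model can be modified and the plot redone without reloading the data.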


    • #3
      Thanks very much for this Clyde, greatly appreciated.

      As I have a sample size of 200, I went with deciles, as in your original code.

      Code:
      predict phat, mu
      xtile quantile = phat, nq(10)
      collapse (sum) predicted = phat observed = form_f1, by(quantile)
      graph twoway connected observed predicted, sort

      This is what I got. I don't have much experience with calibration plots, but it wasn't quite what I was expecting, so I am a little uncertain about the interpretation.


      • #4
        I'm afraid the graph you posted is not readable, at least not on my computer. In fact, there is no graph at all, just an icon that does nothing when clicked on. I don't know how you went about trying to post the picture, but the FAQ recommends that Stata graphs be exported to .png for posting. Posting Stata .gph files is almost always unsuccessful.

        Also, as my telepathic skills are mediocre, it would be helpful when you post back if you explain what you were expecting.


        • #5
          Apologies Clyde! I thought I had used .png but must have accidentally clicked .jpg in haste before a meeting.

          I do hope that this is better (i.e. visible!). In terms of expectations, I wasn't expecting a single line. However, I am not very familiar with these plots, so that is not to say that this isn't how it should look.

          My apologies again and thank you for your support and advice!

          [Attached image: calibration plot.png]

          • #6
            Well, the line is actually somewhat illusory--really more decorative than informative. The connected points are the meat of it. The graph shows the relationship between the predicted number of positive outcomes and the number observed in each of the 10 decile groups of predicted probability. If the model were perfectly calibrated, these points would fall exactly on the diagonal of the graph. To the extent they are above or below the diagonal, there is under- or over-prediction, respectively, by the model at that level of predicted probability. Of course, there will be some random variation from the diagonal in any real data. The issue arises when there is a substantial departure, or systematic departures. For example, a model might systematically overpredict at low predicted probabilities and systematically underpredict at high ones. Or a model might systematically overpredict at the extremes and underpredict in the middle. These patterns, when they occur, sometimes give clues to how a model might be improved (e.g. by adding new variables, or interactions).

            Often these curves are graphed with the diagonal on the graph as well, to make the closeness (or lack thereof) to the diagonal more apparent. If you prefer to do it that way, just change the final command to:
            Code:
            graph twoway (connected observed predicted, sort) (line predicted predicted, sort)
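
            To see the departures from the diagonal as numbers rather than reading them off the graph, the collapsed data can also be listed. This is only a sketch: it assumes Stata's bangladesh example dataset (via -webuse-) and an illustrative model as stand-ins for your own, and the variable name diff is my own invention:

            ```stata
            // ASSUMPTION: THE bangladesh EXAMPLE DATA STAND IN FOR YOUR OWN DATA
            webuse bangladesh, clear
            melogit c_use urban age || district:
            predict phat, mu
            xtile quantile = phat, nq(10)
            collapse (sum) predicted = phat observed = c_use, by(quantile)

            // POSITIVE diff: MORE EVENTS OBSERVED THAN PREDICTED (UNDERPREDICTION)
            // NEGATIVE diff: FEWER OBSERVED THAN PREDICTED (OVERPREDICTION)
            generate diff = observed - predicted
            list quantile observed predicted diff, noobs clean
            ```

            A run of same-signed values of diff at one end of the list would correspond to the kind of systematic departure described above.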



            • #7
              Many thanks again! Yes, that final code is more what I was thinking of to create the plot.
              Really appreciate your insights.


              • #8
                Dear Clyde,

                Following the above commands, I have plotted this curve from my data, which include 30-day mortality as the outcome variable and frailty status as the predictor. However, I am getting a scale from 160 to 220. Is it possible to get a scale of 0 (0.05) 0.20?

                Thanks,
                Yogesh


                • #9
                  As is made clear in the Forum FAQ, attachments are, in general, discouraged, and attaching Office documents is particularly problematic because they can contain active malware. Many of us here will not download them from people we don't know. Even if I saw your graph, you have explained nothing about your data, nor about how the graph was created, so it is impossible to answer your question. Please post back. Export the graph as a .png file and post that to show the graph. Explain how the graph was created, particularly showing the exact code used. Also explain your data and why you think it would be desirable, even if possible, to have a scale of 0 to 0.2 when the actual data range from 160 to 220.


                  • #10
                    Apologies Clyde! I used the following commands to determine the crude OR:

                    Code:
                    melogit died_30 frail_2cat || hosp:, or
                    predict phat, mu
                    xtile quantile = phat, nq(10)
                    collapse (sum) predicted = phat observed = died_30, by(quantile)
                    graph twoway (connected observed predicted, sort) (line predicted predicted, sort)

                    [Attached image: Graph.png]

                    • #11
                      You can plot sum versus sum or mean versus mean. Those graphs should show similar shapes but different numbers on the axes.

                      A bigger puzzle is why you have so few points on the graph when your code asks for decile bins.
                      Last edited by Nick Cox; 19 Dec 2021, 07:20.
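
                      One thing that can be checked here, offered as a possible explanation rather than a diagnosis, is how many distinct groups -xtile- actually created. Tied values of the predicted probability always go into the same group, so a model that yields only a few distinct predictions can produce fewer than 10 groups. A sketch, using Stata's bangladesh example dataset (via -webuse-) and a single binary predictor as assumed stand-ins for the real data:

                      ```stata
                      // ASSUMPTION: THE bangladesh EXAMPLE DATA STAND IN FOR YOUR OWN DATA
                      webuse bangladesh, clear
                      melogit c_use urban || district:
                      predict phat, mu
                      xtile quantile = phat, nq(10)

                      // HOW MANY DISTINCT PREDICTED PROBABILITIES ARE THERE?
                      codebook phat, compact

                      // HOW MANY GROUPS DID -xtile- CREATE, AND HOW BIG ARE THEY?
                      tabulate quantile
                      ```

                      These checks need to be run before -collapse-, while the individual observations are still in memory.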


                      • #12

                        The outcome variable died_30 has 794 events coded as 1.

                        After fitting a multivariable regression with the following commands, I get this graph:

                        Code:
                        melogit died_30 frail_2cat age sex charlson irsddecile || hosp:, or
                        predict phat, mu
                        xtile quantile = phat, nq(10)
                        collapse (sum) predicted = phat observed = died_30, by(quantile)
                        graph twoway (connected observed predicted, sort) (line predicted predicted, sort)

                        [Attached image: Graph(2).png]

                        • #13
                          Elaborating on Nick Cox's advice in #11, if you prefer to have the calibration plot in terms of probabilities rather than N's, just change -(sum)- to -(mean)- in the -collapse- command. The graph will have essentially the same shape (exactly the same if all the groups are the same size); only the numbers on the axes will differ.
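
                          A sketch of the -(mean)- version follows. Stata's bangladesh example dataset (via -webuse-) and the model are assumed stand-ins for your own; the axis range, titles, and legend text are illustrative choices, not part of the original code:

                          ```stata
                          // ASSUMPTION: THE bangladesh EXAMPLE DATA STAND IN FOR YOUR OWN DATA
                          webuse bangladesh, clear
                          melogit c_use urban age || district:
                          predict phat, mu
                          xtile quantile = phat, nq(10)

                          // (mean) GIVES OBSERVED PROPORTION AND MEAN PREDICTED PROBABILITY
                          collapse (mean) predicted = phat observed = c_use, by(quantile)

                          // DIAGONAL DRAWN WITH -twoway function-; AXIS RANGE IS ILLUSTRATIVE
                          graph twoway (connected observed predicted, sort)        ///
                              (function y = x, range(0 1)),                        ///
                              xlabel(0(0.2)1) ylabel(0(0.2)1)                      ///
                              xtitle("Mean predicted probability")                 ///
                              ytitle("Observed proportion")                        ///
                              legend(order(1 "Observed" 2 "Perfect calibration"))
                          ```

                          For the scale asked about in #8, the options would instead be xlabel(0(0.05)0.20) ylabel(0(0.05)0.20) with range(0 0.2), which is only sensible if the proportions actually fall in that range.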


                          • #14
                            Thanks Clyde and Nick,

                            I plan to generate 3 such graphs for 3 different outcome variables. Is there a Stata command to combine these 3 graphs in a single figure for presentation, rather than showing them individually as seen below?

                            [Attached image: Figure.png]

                            • #15
                              Consult the [G] volume of the PDF documentation that comes installed with Stata to learn about the || operator that can be used to overlay multiple curves on the same set of axes.

                              That said, the 6 scatterplots you show in #14 will be almost entirely superimposed on each other if you overlay them. I think the result will be an unreadable mess. You can try it--maybe I'm wrong. But I think you will find the arrangement you currently have is better.
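
                              If separate panels within a single figure are what is wanted, rather than an overlay, the official -graph combine- command can arrange named graphs side by side. In this sketch the graph names g1 to g3, the titles, the output file name, and the auto dataset variables are all placeholders standing in for the three calibration plots:

                              ```stata
                              // ASSUMPTION: SCATTERPLOTS FROM auto.dta STAND IN FOR THE CALIBRATION PLOTS
                              sysuse auto, clear

                              // name() KEEPS EACH GRAPH IN MEMORY; nodraw SUPPRESSES ITS DISPLAY
                              graph twoway scatter mpg weight, name(g1, replace) nodraw title("Outcome 1")
                              graph twoway scatter price weight, name(g2, replace) nodraw title("Outcome 2")
                              graph twoway scatter length weight, name(g3, replace) nodraw title("Outcome 3")

                              // ARRANGE THE THREE PANELS IN ONE ROW OF A SINGLE FIGURE
                              graph combine g1 g2 g3, rows(1)
                              graph export combined.png, replace
                              ```

                              Each panel keeps its own axes, so nothing is superimposed.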
