Help to evaluate the diagnostic performance of a predictive model

    Dear Statalist members,

    As part of developing a nomogram, I fitted a logistic regression model that predicts muscle invasiveness at final pathology, and I would like to evaluate its diagnostic performance.

    This is the code I used, together with its output:

    Code:
    //EAU tumor-related model (based only on tumor features)
    .
    . logistic muscle_invasive grade_bio clinicalt_high size_tumor_high_EAU multifocal variant_histology
    note: variant_histology != 0 predicts success perfectly;
          variant_histology omitted and 5 obs not used.
    
    
    Logistic regression                                     Number of obs =    419
                                                            LR chi2(4)    =  70.77
                                                            Prob > chi2   = 0.0000
    Log likelihood = -254.41337                             Pseudo R2     = 0.1221
    
    -------------------------------------------------------------------------------------
        muscle_invasive | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
    --------------------+----------------------------------------------------------------
              grade_bio |   4.532146   1.109442     6.17   0.000     2.805013    7.322727
         clinicalt_high |   3.461076   .8808343     4.88   0.000     2.101758    5.699537
    size_tumor_high_EAU |   1.054306   .2442513     0.23   0.819     .6695275    1.660216
             multifocal |    .790829   .2003062    -0.93   0.354     .4813764    1.299213
      variant_histology |          1  (omitted)
                  _cons |   .2788742   .0778792    -4.57   0.000     .1613241    .4820781
    -------------------------------------------------------------------------------------
    Note: _cons estimates baseline odds.
    
    .
    . lroc
    
    Logistic model for muscle_invasive
    
    Number of observations =      419
    Area under ROC curve   =   0.7150
    
    .
    . looclass muscle_invasive grade_bio clinicalt_high size_tumor_high_EAU multifocal variant_histology, model(logit) fig
    
    
    Iterating across (424) observations
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    ..................................................    50
    ..................................................   100
    ..................................................   150
    ..................................................   200
    ..................................................   250
    ..................................................   300
    ..................................................   350
    ..................................................   400
    ........................
    
    
    Classification Table for Full Data:
    
                  -------- True --------
    Classified |         D            ~D  |      Total
    -----------+--------------------------+-----------
         +     |       194           106  |        300
         -     |        32            92  |        124
    -----------+--------------------------+-----------
       Total   |       226           198  |        424
    
    
    
    Classification Table for Test Data:
    
                  -------- True --------
    Classified |         D            ~D  |      Total
    -----------+--------------------------+-----------
         +     |       187           121  |        308
         -     |        39            77  |        116
    -----------+--------------------------+-----------
       Total   |       226           198  |        424
    
    
    
    Classified + if predicted Pr(D) >= .5
    True D defined as  != 0
                                                Full         Test
    ----------------------------------------------------------------
    Sensitivity                     Pr( +| D)   85.84%       82.74%
    Specificity                     Pr( -|~D)   46.46%       38.89%
    Positive predictive value       Pr( D| +)   64.67%       60.71%
    Negative predictive value       Pr(~D| -)   74.19%       66.38%
    ----------------------------------------------------------------
    False + rate for true ~D        Pr( +|~D)   53.54%       61.11%
    False - rate for true D         Pr( -| D)   14.16%       17.26%
    False + rate for classified +   Pr(~D| +)   35.33%       39.29%
    False - rate for classified -   Pr( D| -)   25.81%       33.62%
    ----------------------------------------------------------------
    Correctly classified                        67.45%       62.26%
    ----------------------------------------------------------------
    ROC area                                    0.7150       0.6425
    ----------------------------------------------------------------
    p-value for Full vs Test ROC areas                       0.0000
    ----------------------------------------------------------------
    
    .
    . capture drop TumorModelEAU
    
    .
    . predict TumorModelEAU
    (option pr assumed; Pr(muscle_invasive))
    (1,139 missing values generated)
    
    . roctab muscle_invasive TumorModelEAU, detail
    
    Detailed report of sensitivity and specificity
    ------------------------------------------------------------------------------
                                               Correctly
    Cutpoint      Sensitivity   Specificity   classified          LR+          LR-
    ------------------------------------------------------------------------------
    ( >= .1799.. )    100.00%         0.00%       52.74%       1.0000    
    ( >= .1869.. )     98.19%         3.03%       53.22%       1.0126       0.5973
    ( >= .215941 )     97.74%         8.59%       55.61%       1.0692       0.2635
    ( >= .2239.. )     95.48%        13.13%       56.56%       1.0991       0.3446
    ( >= .4261.. )     90.50%        36.87%       65.16%       1.4335       0.2577
    ( >= .4376.. )     90.50%        37.37%       65.39%       1.4450       0.2542
    ( >= .482455 )     90.05%        38.89%       65.87%       1.4735       0.2560
    ( >= .4941.. )     87.78%        40.91%       65.63%       1.4856       0.2986
    ( >= .5034.. )     84.62%        44.44%       65.63%       1.5231       0.3462
    ( >= .5151.. )     82.35%        50.00%       67.06%       1.6471       0.3529
    ( >= .5599.. )     73.76%        56.06%       65.39%       1.6786       0.4681
    ( >= .5715.. )     58.82%        68.69%       63.48%       1.8786       0.5995
    ( >= .7743.. )     32.58%        91.92%       60.62%       4.0317       0.7335
    ( >= .7824.. )     32.13%        92.93%       60.86%       4.5436       0.7304
    ( >= .811598 )     23.98%        94.44%       57.28%       4.3167       0.8049
    ( >= .8186.. )     19.46%        95.96%       55.61%       4.8156       0.8393
    ( >  .8186.. )      0.00%       100.00%       47.26%                    1.0000
    ------------------------------------------------------------------------------
    
    
                          ROC                     Asymptotic normal  
               Obs       area     Std. err.      [95% conf. interval]
         ------------------------------------------------------------
               419     0.7141       0.0246        0.66587     0.76228
    
    .
    end of do-file
    
    . do "/var/folders/dk/cnyy1w7957ddykv9stjvy6gr0000gn/T//SD03496.000000"
    
    . cutpt  muscle_invasive TumorModelEAU, noadjust
    
    Empirical cutpoint estimation
    Method:                                Liu
    Reference variable:                    muscle_invasive (0=neg, 1=pos)
    Classification variable:               TumorModelEAU
    Empirical optimal cutpoint:            .5151754
    Sensitivity at cutpoint:               0.74
    Specificity at cutpoint:               0.56
    Area under ROC curve at cutpoint:      0.65
    
    .
    end of do-file
    I then evaluated the net benefit at a range of cut-off points using the dca command, as follows:

    Code:
    dca muscle_invasive TumorModelEAU, prob(no) xstart(0.1) xstop(0.6) xby(0.05) nograph ///
     saving("DCA Tumor Model EAU.dta", replace)
     
    use "DCA Tumor Model EAU.dta", clear
     
    g advantage = TumorModelEAU - all
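
    For reference, the quantity being compared here can be written out. This is the usual decision-curve definition of net benefit, not something taken from the output above, and it assumes that the saved variables TumorModelEAU and all hold the model and treat-all net benefit at each threshold. TP and FP are the numbers of true and false positives at threshold probability p_t, n is the number of observations, and \pi is the prevalence of muscle invasiveness:

    ```latex
    \[
    \mathrm{NB}_{\text{model}}(p_t) = \frac{\mathrm{TP}(p_t)}{n} - \frac{\mathrm{FP}(p_t)}{n}\cdot\frac{p_t}{1-p_t},
    \qquad
    \mathrm{NB}_{\text{all}}(p_t) = \pi - (1-\pi)\cdot\frac{p_t}{1-p_t}
    \]
    \[
    \text{advantage}(p_t) = \mathrm{NB}_{\text{model}}(p_t) - \mathrm{NB}_{\text{all}}(p_t)
    \]
    ```

    Under that reading, a positive advantage at a given threshold means the model does better than treating every patient as muscle-invasive at that threshold.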
    I have a few unresolved questions:

    - Can I change the cut-offs at which roctab, detail reports sensitivity and specificity, so that they match the thresholds I used to calculate the net benefit ("dca muscle_invasive TumorModelEAU, prob(no) xstart(0.1) xstop(0.6) xby(0.05) nograph")?
    - How can I calculate the number of patients in whom muscle invasiveness would be missed at each cut-off? Is this by any chance the complement of "Correctly classified"?
    - How can I calculate the positive predictive value and negative predictive value at each cut-off point and at the optimal cut-off for the predictive model?
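
    One possible way to approach all three points, offered as a sketch rather than a tested solution: as far as I know, roctab, detail only tabulates the cut-offs that actually occur in the data, so the 2x2 counts can instead be computed directly at the dca thresholds. The code below assumes that muscle_invasive is coded 0/1 and that TumorModelEAU still holds the predicted probabilities, so it has to run before the use ..., clear step. The local names (pct, c, tp, fp, fn, tn) and the display layout are illustrative only. Patients are classified positive when the prediction is >= the cut-off, as in the roctab table.

    ```stata
    * Sketch: 2x2 counts at each dca threshold (0.10 to 0.60 in steps of 0.05).
    * Assumes muscle_invasive is 0/1 and TumorModelEAU is the predicted probability.
    display "cutoff  missed(FN)    sens    spec     PPV     NPV"
    foreach pct of numlist 10(5)60 {
        local c = `pct'/100
        * Stata treats missing as larger than any number, so exclude it explicitly
        quietly count if muscle_invasive == 1 & TumorModelEAU >= `c' & !missing(TumorModelEAU)
        local tp = r(N)
        quietly count if muscle_invasive == 0 & TumorModelEAU >= `c' & !missing(TumorModelEAU)
        local fp = r(N)
        quietly count if muscle_invasive == 1 & TumorModelEAU < `c'
        local fn = r(N)
        quietly count if muscle_invasive == 0 & TumorModelEAU < `c'
        local tn = r(N)
        * A zero denominator gives missing (.), e.g. NPV when nobody is classified negative
        display %6.2f `c' %12.0f `fn' ///
            %8.1f 100*`tp'/(`tp'+`fn') %8.1f 100*`tn'/(`tn'+`fp') ///
            %8.1f 100*`tp'/(`tp'+`fp') %8.1f 100*`tn'/(`tn'+`fn')
    }
    ```

    The "missed" column is the number of false negatives (true D classified -), which is not the complement of "Correctly classified". At the 0.5 cut-off, for example, the looclass full-data table above shows 32 missed cases, whereas 100% - 67.45% = 32.55% corresponds to 138 of 424 patients, because it also counts the 106 false positives. Only observations with a non-missing prediction enter these counts (419 according to the roctab output), so they need not match the looclass tables, which use 424. For the optimal cut-off, the same four counts can be computed at the value reported by cutpt; since the model yields only 16 distinct predicted values, whether the boundary is applied as >= or > matters there, and the cutpt figures above (0.74 and 0.56) match the roctab row for >= .5599 rather than >= .5151, so the two commands may treat the boundary differently. Alternatively, estat classification, cutoff(#) run directly after logistic reports sensitivity, specificity, PPV and NPV for a single cut-off.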

    Your help would be greatly appreciated.

    Thank you,
    Francesco
    Last edited by Francesco Ditonno; 31 Jan 2024, 17:52.