  • New package -cpt-, Optimal cut-points for empirical ROC curves and other ROC/AUC calculations, is available on the SSC

    Thanks to Kit Baum, a new package -cpt- is now available on SSC.

    It is a wrapper for the command -roctab- with some extra features.
    Based on the trick mentioned by Svend Juul, https://www.stata.com/statalist/arch.../msg00069.html, -cpt- returns the AUC with a confidence interval for a logistic regression with at least one independent variable.
    Code:
    . webuse lbw
    (Hosmer & Lemeshow data)
    
    . cpt low i.smoke i.race
    AUC(%) = 65.0 [57.2; 72.8]
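    For comparison, the same kind of number can be obtained step by step with built-in commands. This is only a sketch of the general idea (fit the model, predict the probabilities, pass them to -roctab-), not a description of what -cpt- does internally. The variable name p_hat is an assumption made for this illustration.

    ```stata
    * Sketch only: AUC for a logistic model using built-in commands
    webuse lbw, clear

    * Fit the same model as in the -cpt- call above
    logit low i.smoke i.race

    * Predicted probability of the outcome (p_hat is a name chosen here)
    predict double p_hat, pr

    * ROC area with standard error and confidence interval
    roctab low p_hat

    * The same area as reported by the postestimation command
    lroc, nograph
    ```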
    The AUC and its confidence interval are saved as numbers in the matrix r(auc)
    Code:
    . matprint r(auc), decimals((0,2))
      
    -----------------------------------------------------
                                N   AUC    se  [95%   CI]
    -----------------------------------------------------
    Birthweight<2500g  p_low  189  0.65  0.04  0.57  0.73
    -----------------------------------------------------
    and as text in the macro r(auctext)
    Code:
    . di "`r(auctext)'"
    AUC(%) = 65.0 [57.2; 72.8]
    Also, two optimal cutpoints, one based on Youden's J and one based on the Liu criterion, are found and saved in the matrix r(cutpt).
    Code:
    . matprint r(cutpt)
      
    -------------------------------------------------------------------------------
                   sensitivity  specificity   PPV   NPV  accuracy   lr+   lr-   AUC
    -------------------------------------------------------------------------------
    Youden(0.319)         0.93         0.31  0.38  0.91      0.50  1.35  0.22  0.62
    Liu(0.326)            0.51         0.66  0.41  0.75      0.61  1.50  0.74  0.59
    -------------------------------------------------------------------------------
    Note that the PPV and the NPV are based on the observed prevalence of the outcome.
    The AUC column in the matrix is the mean of the sensitivity and the specificity at that cutpoint, as reported with confusion matrices.
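    The columns can be checked by hand. The formulas below are the standard textbook definitions, written with sens for sensitivity, spec for specificity and p for the observed prevalence (about 0.31 in this example). That -cpt- uses exactly these definitions is an assumption; the post does not spell them out, but the reported values agree with them up to rounding.

    ```latex
    \begin{align*}
    J &= \mathrm{sens} + \mathrm{spec} - 1 && \text{(Youden, maximised over cutpoints)} \\
    L &= \mathrm{sens} \times \mathrm{spec} && \text{(Liu, maximised over cutpoints)} \\
    \mathrm{PPV} &= \frac{\mathrm{sens}\, p}{\mathrm{sens}\, p + (1-\mathrm{spec})(1-p)} \\
    \mathrm{NPV} &= \frac{\mathrm{spec}\,(1-p)}{\mathrm{spec}\,(1-p) + (1-\mathrm{sens})\, p} \\
    \mathrm{accuracy} &= \mathrm{sens}\, p + \mathrm{spec}\,(1-p) \\
    \mathrm{LR}^{+} &= \frac{\mathrm{sens}}{1-\mathrm{spec}}, \qquad
    \mathrm{LR}^{-} = \frac{1-\mathrm{sens}}{\mathrm{spec}} \\
    \mathrm{AUC}_{\text{cutpoint}} &= \frac{\mathrm{sens} + \mathrm{spec}}{2}
    \end{align*}
    ```

    For the Youden row (sens = 0.93, spec = 0.31) this gives J = 0.24, lr+ = 0.93/0.69 = 1.35 and AUC = (0.93 + 0.31)/2 = 0.62, and with p = 0.31 a PPV of about 0.38 and an NPV of about 0.91.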

    All cutpoints are reported in the matrix r(roc).
    Code:
    . matprint r(roc)
      
    ---------------------------------------------------------------------------
               sensitivity  specificity   PPV   NPV  accuracy   lr+   lr-   AUC
    ---------------------------------------------------------------------------
    >=0.137           1.00         0.00  0.31            0.31  1.00        0.50
    >=0.319 J         0.93         0.31  0.38  0.91      0.50  1.35  0.22  0.62
    >=0.325           0.85         0.39  0.39  0.85      0.53  1.39  0.39  0.62
    >=0.326 L         0.51         0.66  0.41  0.75      0.61  1.50  0.74  0.59
    >=0.589           0.19         0.92  0.50  0.71      0.69  2.20  0.89  0.55
    >=0.595           0.08         0.95  0.42  0.69      0.68  1.57  0.97  0.52
    >=0.595           0.00         1.00        0.69      0.69        1.00  0.50
    ---------------------------------------------------------------------------
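    As a cross-check, a single row of this table can be compared with built-in commands. The sketch below is an illustration, not part of -cpt-. The cutoff 0.3 is chosen to lie between the two lowest predicted probabilities (0.137 and 0.319), so it should correspond to the row marked J. The variable name p_check is an assumption made here.

    ```stata
    * Sketch only: compare one cutpoint with the built-in classification table
    webuse lbw, clear
    quietly logit low i.smoke i.race

    * Classify as positive when the predicted probability is >= 0.3;
    * reports sensitivity, specificity, PPV and NPV in percent
    estat classification, cutoff(0.3)

    * Sensitivity, specificity, LR+ and LR- at every observed cutpoint
    predict double p_check, pr
    roctab low p_check, detail
    ```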
    Further, variables holding the true positive rates (tpr) and the false positive rates (fpr) are generated so that one or more ROC curves can be graphed with the -twoway- command.
    Code:
    . list tpr_p_low fpr_p_low in 1/8
         +---------------------+
         | tpr_p_~w   fpr_p_~w |
         |---------------------|
      1. |     1.00       1.00 |
      2. |     0.93       0.69 |
      3. |     0.85       0.61 |
      4. |     0.51       0.34 |
      5. |     0.19       0.08 |
         |---------------------|
      6. |     0.08       0.05 |
      7. |     0.00       0.00 |
      8. |        .          . |
         +---------------------+
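    These two variables are also enough to recover the empirical AUC as the area under the curve. A minimal sketch, assuming the variables tpr_p_low and fpr_p_low from the listing above are still in memory, uses the built-in -integ- command with the trapezoidal rule:

    ```stata
    * Sketch only: area under the listed ROC points by the trapezoidal rule
    * (assumes tpr_p_low and fpr_p_low exist from an earlier -cpt- run)
    integ tpr_p_low fpr_p_low, trapezoid

    * -integ- leaves the result in r(integral)
    display "Trapezoidal AUC = " %5.3f r(integral)
    ```

    With the rounded values listed above, the trapezoids sum to about 0.65, in line with the AUC reported by -cpt-.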
    A fast way to clean up before rerunning -cpt- is to drop these variables:
    Code:
    . drop ?pr_*
    Finally, with the option graph, a standard ROC curve is graphed.
    Code:
    . cpt low i.smoke i.race, graph
    AUC(%) = 65.0 [57.2; 72.8]
    [Figure: roc.png (ROC curve)]
    and the -twoway- code is saved in r(graph_cmd) for inspiration and further use
    Code:
    . di `"`r(graph_cmd)'"'
    twoway(line tpr_p_low fpr_p_low, connect(direct) lcolor(plb1) msymbol(i) lpattern(solid))(function y = x), xtitle("False positive rate (1-specificity)")ytitle("True positive rate (sen
    > sitivity)")xlabel(0 "0" .25 "25" .5 "50" .75 "75" 1 "100", labsize(small))ylabel(0 "0" .25 "25" .5 "50" .75 "75" 1 "100", labsize(small))legend(off) aspectratio(1) note("AUC(%) = 65
    > .0 [57.2; 72.8]", size(small))
    Enjoy
    Kind regards

    nhb

  • #2
    Thanks to Kit Baum, there is an update of the command -cpt- on SSC.
    The new options are cross-validation and averaging over repeated cross-validations.
    Cross-validation is a technique for assessing the performance of a model by dividing the dataset into multiple subsets (folds), typically of equal size.
    Each fold is used in turn as the validation set while the remaining folds are used for training.
    Repeating the cross-validation several times with new random splits and averaging the results makes the estimates less dependent on any single split and therefore more stable.
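    To make the idea concrete, here is a minimal sketch of a single 10-fold cross-validation done by hand. It is not the implementation used by -cpt-. The seed value and the variable names u, fold, p_tmp and p_cv are assumptions made for this illustration, and pooling the out-of-fold predictions into one AUC is only one of several possible ways to summarise the folds.

    ```stata
    * Sketch only: one 10-fold cross-validation of the logistic model
    webuse lbw, clear
    set seed 12345                        // arbitrary seed, for reproducibility

    * Assign observations to 10 folds of nearly equal size at random
    generate double u = runiform()
    sort u
    generate byte fold = mod(_n, 10) + 1

    * Out-of-fold predictions: train on 9 folds, predict in the held-out fold
    generate double p_cv = .
    forvalues k = 1/10 {
        quietly logit low i.smoke i.race if fold != `k'
        quietly predict double p_tmp if fold == `k', pr
        quietly replace p_cv = p_tmp if fold == `k'
        drop p_tmp
    }

    * AUC based on predictions the model never saw during fitting
    roctab low p_cv
    ```

    Repeating this with different seeds and averaging the resulting AUCs corresponds to the repeated cross-validation described above.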

    The code
    Code:
    . cpt low i.smoke i.race, replace cv(10) reps(20) graph
    AUC(%) = 55.7 [47.0; 64.3]
    splits the dataset at random into 10 folds, repeats this 20 times, and averages over the 20 repetitions.
    The ROC curve becomes:
    [Figure: cpt.png (ROC curve after cross-validation)]
    Last edited by Niels Henrik Bruun; 19 Feb 2024, 03:55.
    Kind regards

    nhb



    • #3
      Once again, thanks to Kit Baum. A minor error had crept in, and the fix was up the same day. Fantastic!!
      Kind regards

      nhb



      • #4
        Thanks to Kit Baum, there is an update of the command -cpt- on SSC.
        New features are an option for setting the seed for the cross-validation and better linear interpolation when finding the optimal cutpoints in the case of a single continuous predictor.
        Kind regards

        nhb



        • #5
          Thanks to Kit Baum, there is an update of the command -cpt- on SSC. It is a bug fix.
          Kind regards

          nhb
