New package -cpt-, Optimal cut-points for empirical ROC curves and other ROC/AUC calculations, is available on the SSC

Niels Henrik Bruun


18 Dec 2023, 00:10

Thanks to Kit Baum, a new package -cpt- is now available on SSC.

It is a wrapper for the command -roctab- with some extra features.
Based on the trick mentioned by Svend Juul (https://www.stata.com/statalist/arch.../msg00069.html), -cpt- returns the AUC with a confidence interval from a logistic regression with at least one independent variable.

Code:

. webuse lbw
(Hosmer & Lemeshow data)

. cpt low i.smoke i.race
AUC(%) = 65.0 [57.2; 72.8]
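
For readers who want to see the idea behind the trick, here is a minimal sketch using only official commands: fit the logistic model, predict the probabilities, and pass them to -roctab-. It illustrates the principle and is not -cpt-'s own code; the variable name p_manual is made up for this example.

```stata
webuse lbw, clear
logistic low i.smoke i.race      // same model as in the -cpt- call above
predict double p_manual, pr      // predicted probability of low birthweight (hypothetical name)
roctab low p_manual              // empirical ROC area with standard error and confidence interval
```

-roctab- leaves the area, its standard error, and the confidence limits in r(area), r(se), r(lb), and r(ub).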

The AUC and its confidence interval are saved as numbers in the matrix r(auc)

Code:

. matprint r(auc), decimals((0,2))
  
-----------------------------------------------------
                            N   AUC    se  [95%   CI]
-----------------------------------------------------
Birthweight<2500g  p_low  189  0.65  0.04  0.57  0.73
-----------------------------------------------------

and as text in a macro

Code:

. di "`r(auctext)'"
AUC(%) = 65.0 [57.2; 72.8]

Also, two optimal cutpoints, one based on Youden's J and one on Liu's criterion, are found and saved in the matrix r(cutpt).

Code:

. matprint r(cutpt)
  
-------------------------------------------------------------------------------
               sensitivity  specificity   PPV   NPV  accuracy   lr+   lr-   AUC
-------------------------------------------------------------------------------
Youden(0.319)         0.93         0.31  0.38  0.91      0.50  1.35  0.22  0.62
Liu(0.326)            0.51         0.66  0.41  0.75      0.61  1.50  0.74  0.59
-------------------------------------------------------------------------------

Note that the PPV and the NPV are based on the observed prevalence of the outcome.
The AUC in the matrix is the mean of the sensitivity and the specificity at that cutpoint, as reported with confusion matrices.
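
As a worked check of the columns, the sketch below recomputes the Youden row by hand. It assumes the usual textbook definitions (Youden's J = sensitivity + specificity - 1, Liu's criterion = sensitivity x specificity), which may differ in detail from what -cpt- does internally. The inputs are the two-decimal values printed above, and the prevalence of 0.31 is read from the PPV in the first row of r(roc), where every observation is classified as positive.

```stata
* Inputs copied from the printed tables (rounded to two decimals)
scalar sens = 0.93      // sensitivity at the Youden cutpoint
scalar spec = 0.31      // specificity at the Youden cutpoint
scalar prev = 0.31      // observed prevalence of the outcome

scalar J     = sens + spec - 1        // Youden's J
scalar liu   = sens * spec            // Liu's criterion
scalar ppv   = sens*prev / (sens*prev + (1 - spec)*(1 - prev))
scalar npv   = spec*(1 - prev) / (spec*(1 - prev) + (1 - sens)*prev)
scalar acc   = sens*prev + spec*(1 - prev)
scalar lrpos = sens / (1 - spec)
scalar lrneg = (1 - sens) / spec
scalar auc1  = (sens + spec) / 2      // the single-cutpoint AUC column

display "J = " %5.3f J "  Liu = " %5.3f liu
display "PPV = " %4.2f ppv "  NPV = " %4.2f npv "  accuracy = " %4.2f acc
display "LR+ = " %4.2f lrpos "  LR- = " %4.2f lrneg "  AUC = " %4.2f auc1
```

Because the inputs are already rounded, the results agree with the table only to within about 0.01.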

All cutpoints are reported in the matrix r(roc), where J and L mark the Youden and Liu cutpoints.

Code:

. matprint r(roc)
  
---------------------------------------------------------------------------
           sensitivity  specificity   PPV   NPV  accuracy   lr+   lr-   AUC
---------------------------------------------------------------------------
>=0.137           1.00         0.00  0.31            0.31  1.00        0.50
>=0.319 J         0.93         0.31  0.38  0.91      0.50  1.35  0.22  0.62
>=0.325           0.85         0.39  0.39  0.85      0.53  1.39  0.39  0.62
>=0.326 L         0.51         0.66  0.41  0.75      0.61  1.50  0.74  0.59
>=0.589           0.19         0.92  0.50  0.71      0.69  2.20  0.89  0.55
>=0.595           0.08         0.95  0.42  0.69      0.68  1.57  0.97  0.52
>=0.595           0.00         1.00        0.69      0.69        1.00  0.50
---------------------------------------------------------------------------

Further, variables holding the true positive rate (tpr) and the false positive rate (fpr) are generated, so that one or more ROC curves can be graphed using the -twoway- command.

Code:

. list tpr_p_low fpr_p_low in 1/8
     +---------------------+
     | tpr_p_~w   fpr_p_~w |
     |---------------------|
  1. |     1.00       1.00 |
  2. |     0.93       0.69 |
  3. |     0.85       0.61 |
  4. |     0.51       0.34 |
  5. |     0.19       0.08 |
     |---------------------|
  6. |     0.08       0.05 |
  7. |     0.00       0.00 |
  8. |        .          . |
     +---------------------+

A fast way to clean up before rerunning -cpt- is to drop these variables:

Code:

. drop ?pr_*

Finally, with the option graph, a standard ROC curve is graphed.

Code:

. cpt low i.smoke i.race, graph
AUC(%) = 65.0 [57.2; 72.8]

[Image: roc.png]

and the -twoway- code is saved for inspiration and further use

Code:

. di `"`r(graph_cmd)'"'
twoway(line tpr_p_low fpr_p_low, connect(direct) lcolor(plb1) msymbol(i) lpattern(solid))(function y = x), xtitle("False positive rate (1-specificity)")ytitle("True positive rate (sen
> sitivity)")xlabel(0 "0" .25 "25" .5 "50" .75 "75" 1 "100", labsize(small))ylabel(0 "0" .25 "25" .5 "50" .75 "75" 1 "100", labsize(small))legend(off) aspectratio(1) note("AUC(%) = 65
> .0 [57.2; 72.8]", size(small))
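
For further use, the saved command is easier to edit when laid out over several lines in a do-file. The sketch below is the same command with line continuations added. It assumes that tpr_p_low and fpr_p_low still exist from the -cpt- run above, and it leaves out lcolor(plb1) on the assumption that this colour name may not exist in older Stata versions.

```stata
* Run from a do-file: /// continues a command on the next line
twoway (line tpr_p_low fpr_p_low, connect(direct) msymbol(i) lpattern(solid)) ///
       (function y = x),                                                       ///
       xtitle("False positive rate (1-specificity)")                           ///
       ytitle("True positive rate (sensitivity)")                              ///
       xlabel(0 "0" .25 "25" .5 "50" .75 "75" 1 "100", labsize(small))        ///
       ylabel(0 "0" .25 "25" .5 "50" .75 "75" 1 "100", labsize(small))        ///
       legend(off) aspectratio(1)                                              ///
       note("AUC(%) = 65.0 [57.2; 72.8]", size(small))
```

From here, titles, colours, or additional line plots for other tpr/fpr variable pairs can be added in the usual -twoway- way.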

Enjoy

Kind regards

nhb


Niels Henrik Bruun

#2

19 Feb 2024, 02:52

Thanks to Kit Baum, there is an update of the command -cpt- on SSC.
The new options are cross-validation and averaging over repeated cross-validations.
Cross-validation is a technique for assessing the performance of a model by dividing the dataset into multiple subsets (folds), typically of equal size.
Each fold is used in turn as the validation set while the remaining folds are used for training.
Repeating the cross-validation with new random splits and averaging the results makes the estimates less dependent on any single split and therefore more precise.
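
To make the idea concrete, a single 10-fold cross-validation of the AUC can be sketched by hand with official commands. This is only an illustration of the technique and not -cpt-'s internal code. The seed value and the variable names u, fold, p_tmp, and p_cv are made up for the example.

```stata
webuse lbw, clear
set seed 12345                              // arbitrary seed, only for reproducibility
generate double u = runiform()
sort u                                      // put the observations in random order
generate byte fold = mod(_n - 1, 10) + 1    // assign 10 folds of nearly equal size
generate double p_cv = .

forvalues k = 1/10 {
    * Train on all folds except k, then predict for the held-out fold k
    quietly logistic low i.smoke i.race if fold != `k'
    quietly predict double p_tmp if fold == `k', pr
    quietly replace p_cv = p_tmp if fold == `k'
    drop p_tmp
}

roctab low p_cv                             // AUC based on out-of-fold predictions only
```

Repeating the block with new random folds and averaging the resulting AUCs gives the repeated cross-validation described above.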

The code

Code:

. cpt low i.smoke i.race, replace cv(10) reps(20) graph
AUC(%) = 55.7 [47.0; 64.3]

splits the dataset at random into 10 folds, repeats this 20 times, and averages the results over the 20 repetitions.
The ROC curve becomes:

Last edited by Niels Henrik Bruun; 19 Feb 2024, 02:55.

Kind regards

nhb
Niels Henrik Bruun

#3

20 Feb 2024, 05:54

Once again, thanks to Kit Baum. A minor error had crept in, and the fix was up the same day. Fantastic!!

Kind regards

nhb
Niels Henrik Bruun

#4

12 Mar 2024, 23:48

Thanks to Kit Baum, there is an update of the command -cpt- on SSC.
New features are an option for setting the seed for cross-validation and better linear interpolation for finding the optimal cutpoints in the case of one continuous predictor.

Kind regards

nhb
Niels Henrik Bruun

#5

17 Mar 2024, 02:14

Thanks to Kit Baum, there is an update of the command -cpt- on SSC. It is a bug fix.

Kind regards

nhb