95% CI of Sen, Sp, PPV, NPV, and AUROC curve

Abdullah Algarni

Join Date: Jul 2022

Posts: 66
#1

13 Oct 2023, 08:28

Dear colleagues,

I want to present 95% confidence intervals (95% CIs) for sensitivity, specificity, PPV, NPV, and the area under the ROC curve (AUROC).

I understand that the 95% CIs for sensitivity, specificity, PPV, and NPV are usually computed from the exact binomial distribution, and that the 95% CI for the AUROC is based on asymptotic normality. However, one of the co-investigators in our research suggested that a bias-corrected and accelerated (BCa) clustered bootstrap is better than the exact binomial and asymptotic normal approaches, respectively. I don't know which approach is more precise. Please help us choose the best strategy.

Thank you
Abdullah

Sincerely regards,
Abdullah Algarni
Bruce Weaver

Join Date: May 2014

Posts: 1119
#2

13 Oct 2023, 14:15

Hello Abdullah. The -roctab- command offers methods of computing a CI for AUC other than the asymptotic normal method. From the documentation:

By default, roctab calculates the standard error for the area under the curve by using an algorithm
suggested by DeLong, DeLong, and Clarke-Pearson (1988) and asymptotic normal confidence intervals.
Optionally, standard errors based on methods suggested by Bamber (1975) or Hanley and McNeil (1982)
can be computed by specifying bamber or hanley, respectively, and an exact binomial confidence
interval can be obtained by specifying binomial.

Perhaps your colleague would be satisfied with one of those methods?

Re Sens, Spec, PV+ and PV- (the latter two are my preferred way of expressing PPV and NPV), do you mean for a particular cut-point that has been identified? Or are you talking about a much larger table showing these statistics for every possible cut-point?

Perhaps you can generate an example similar to what you are doing using the Hanley & McNeil (1982) data that is used in the examples for -roctab-:

Code:

clear
use https://www.stata-press.com/data/r18/hanley

Sens, Spec, PV+ and PV- are all binomial proportions, so I would be inclined to report Wilson (or Agresti-Coull) CIs for them. YMMV.
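A minimal sketch of both suggestions on the Hanley & McNeil example data follows. It assumes the variables are named disease (0/1) and rating (1 to 5), as in the -roctab- manual examples, and the cut-point rating >= 4 is purely illustrative, not a recommendation. The variables testpos and testneg are hypothetical helpers created only for this example.

```stata
* Sketch only: assumes variables -disease- (0/1) and -rating- (1-5)
use https://www.stata-press.com/data/r18/hanley, clear

* AUC with the standard errors / CIs listed in the -roctab- documentation
roctab disease rating              // default: DeLong SE, asymptotic normal CI
roctab disease rating, hanley      // Hanley & McNeil (1982) SE
roctab disease rating, bamber      // Bamber (1975) SE
roctab disease rating, binomial    // exact binomial CI

* Illustrative (hypothetical) cut-point: test is positive if rating >= 4
generate byte testpos = rating >= 4 if !missing(rating)
generate byte testneg = !testpos if !missing(testpos)

* Sensitivity = proportion of test positives among the diseased
ci proportions testpos if disease == 1, wilson

* Specificity = proportion of test negatives among the non-diseased
ci proportions testneg if disease == 0, wilson
```

Replacing wilson with exact or agresti gives the other two interval types for comparison.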

Cheers,
Bruce

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)

Abdullah Algarni

Join Date: Jul 2022
Posts: 66

14 Oct 2023, 12:30

Thank you, Bruce!

Bruce Weaver wrote:
"The -roctab- command offers methods of computing a CI for AUC other than the asymptotic normal method. [...] Perhaps your colleague would be satisfied with one of those methods?"

We will choose asymptotic normal confidence intervals.

Bruce Weaver wrote:
"Re Sens, Spec, PV+ and PV- [...], do you mean for a particular cut-point that has been identified? Or are you talking about a much larger table showing these statistics for every possible cut-point?"

Yes! I meant for a particular cut-point that has been identified (qSOFA score ≥2). See my code below, please.

Code:

. roctab died30in qsofa2, detail table summary

   Patient |
      died |
    during |
in-hospita |
     l F/U |
   (30-day |        qsofa2
     max)? |        No        Yes |     Total
-----------+----------------------+----------
        No |     2,659        328 |     2,987 
       Yes |       166        121 |       287 
-----------+----------------------+----------
     Total |     2,825        449 |     3,274 

Detailed report of sensitivity and specificity
------------------------------------------------------------------------------
                                           Correctly
Cutpoint      Sensitivity   Specificity   classified          LR+          LR-
------------------------------------------------------------------------------
( >= No )         100.00%         0.00%        8.77%       1.0000     
( >= Yes )         42.16%        89.02%       84.91%       3.8394       0.6497
( >  Yes )          0.00%       100.00%       91.23%                    1.0000
------------------------------------------------------------------------------


                      ROC                     Asymptotic normal  
           Obs       area     Std. err.      [95% conf. interval]
     ------------------------------------------------------------
         3,274     0.6559       0.0149        0.62674     0.68506

Code:

. diagt died30in qsofa2

   Patient |
      died |
    during |
in-hospita |
     l F/U |
   (30-day |       qSOFA ≥2
     max)? |      Pos.       Neg. |     Total
-----------+----------------------+----------
  Abnormal |       121        166 |       287 
    Normal |       328      2,659 |     2,987 
-----------+----------------------+----------
     Total |       449      2,825 |     3,274 
True abnormal diagnosis defined as died30in = 1 (labelled Yes)


                                                  [95% Confidence Interval]
---------------------------------------------------------------------------
Prevalence                         Pr(A)      8.8%      7.8%      9.79%
---------------------------------------------------------------------------
Sensitivity                      Pr(+|A)     42.2%     36.4%     48.1%
Specificity                      Pr(-|N)       89%     87.8%     90.1%
ROC area               (Sens. + Spec.)/2      .656      .627      .685 
---------------------------------------------------------------------------
Likelihood ratio (+)     Pr(+|A)/Pr(+|N)      3.84      3.24      4.55 
Likelihood ratio (-)     Pr(-|A)/Pr(-|N)       .65      .588      .718 
Odds ratio                   LR(+)/LR(-)      5.91      4.55      7.67 
Positive predictive value        Pr(A|+)     26.9%     22.9%     31.3% 
Negative predictive value        Pr(N|-)     94.1%     93.2%       95% 
---------------------------------------------------------------------------

Sincerely regards,
Abdullah Algarni

Bruce Weaver

Join Date: May 2014
Posts: 1119

15 Oct 2023, 16:00

Abdullah Algarni wrote:
"We will choose asymptotic normal confidence intervals."

Now I am confused. What about the co-investigator who is opposed to that? Would he/she be satisfied with one of the other methods, such as the Hanley & McNeil (1982) method? As the output from my code shows (see below), it is computed as AUC +/- SE*z_crit, just like the ordinary asymptotic normal method, but it uses a different SE. My code also shows one way to get the Wilson (or Agresti-Coull) CIs I would prefer for Sens, Spec, PV+ and PV-. I hope this helps.

Code:

* Generate Abdullah's data    
clear
input died30in qsofa2 n
0 0 2659
0 1 328
1 0 166
1 1 121
end
expand n
* We now have the data, at least for the 2 needed variables.
* Use the hanley option for -roctab-
roctab died30in qsofa2, detail table summary hanley
* Compute the Hanley & McNeil CI for AUC "by hand"
display _newline ///
"        AUC = " r(area) _newline ///
"Lower bound = " r(area) - r(se)*invnormal(.975) _newline ///
"Upper bound = " r(area) + r(se)*invnormal(.975)

* ssc install diagt // Uncomment to install -diagt- if necessary
diagt died30in qsofa2  

* Generate flag variables needed to get CIs for
* Sens, Spec, PV+ and PV- using -ci proportion-.
* The following commands assume 0/1 coding for both variables.
generate Sens  = qsofa2 if died30in
generate Spec  = !qsofa2 if !died30in
generate PVpos = died30in if qsofa2
generate PVneg = !died30in if !qsofa2

* Now use -ci proportion- to compute the CIs
* for these statistics.  The following foreach
* structure loops through the exact (binomial),
* Wilson, and Agresti-Coull methods.  Arguably,
* the latter two methods have advantages over
* the exact binomial CIs. I would likely report
* the Wilson CIs.  YMMV.

foreach method in "exact" "wilson" "agresti" {
 display _newline as result "CI method: " "`method'"    
 ci proportion Sens, `method'
 ci proportion Spec, `method'
 ci proportion PVpos, `method'  
 ci proportion PVneg, `method'  
}
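For the BCa bootstrap raised in #1, one possible sketch uses the -bootstrap- prefix with -roctab-, which leaves the AUC in r(area). The reps() and seed() values are arbitrary choices for illustration, and hospid is a hypothetical cluster identifier that does not exist in the data generated here. A clustered bootstrap is only meaningful if the real data are actually clustered (for example, patients within hospitals).

```stata
* Sketch only: rebuild the 2x2 data from the counts in the thread
clear
input died30in qsofa2 n
0 0 2659
0 1 328
1 0 166
1 1 121
end
expand n

* Bootstrap the AUC that -roctab- stores in r(area).
* The -bca- option adds a jackknife pass to estimate the
* acceleration, so this can take a while with 3,274 observations.
bootstrap auc = r(area), reps(1000) seed(20231015) bca: roctab died30in qsofa2

* Show normal, percentile, bias-corrected, and BCa intervals side by side
estat bootstrap, all

* With clustered data, resample whole clusters instead.
* -hospid- is hypothetical; uncomment only if such a variable exists.
* bootstrap auc = r(area), reps(1000) seed(20231015) bca cluster(hospid): roctab died30in qsofa2
```

With a sample this large, the bootstrap intervals would be expected to sit close to the asymptotic normal interval, which is one way to check whether the choice of method matters much here.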

PS - If I wanted to report CIs for the likelihood ratios, I would consider the MOVER-R method described here:

https://journals.sagepub.com/doi/ful...62280213502144

I have not yet found an implementation of it in Stata. But I have not really done a proper thorough search yet either.
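In the absence of a packaged command, a MOVER-R interval for LR+ can be sketched by hand from Wilson limits for its two components. The formula below is the MOVER-R ratio formula as I understand it and should be checked against the linked paper before use. The counts come from the 2x2 table earlier in the thread, and all scalar names are made up for this example.

```stata
* Sketch only: MOVER-R interval for LR+ = Sens / (1 - Spec),
* built from Wilson limits for two independent proportions.

* Numerator: Sens = 121/287
cii proportions 287 121, wilson
scalar t1 = r(proportion)
scalar l1 = r(lb)
scalar u1 = r(ub)

* Denominator: 1 - Spec = 328/2987
cii proportions 2987 328, wilson
scalar t2 = r(proportion)
scalar l2 = r(lb)
scalar u2 = r(ub)

* MOVER-R limits for the ratio t1/t2 (assumed form of the formula)
scalar lr_lb = (t1*t2 - sqrt((t1*t2)^2 - l1*(2*t1 - l1)*u2*(2*t2 - u2))) / (u2*(2*t2 - u2))
scalar lr_ub = (t1*t2 + sqrt((t1*t2)^2 - u1*(2*t1 - u1)*l2*(2*t2 - l2))) / (l2*(2*t2 - l2))

display "LR+ = " %6.3f t1/t2 "   MOVER-R 95% CI: " %6.3f lr_lb " to " %6.3f lr_ub
```

The same pattern should apply to LR- by using 1 - Sens (166/287) as the numerator and Spec (2659/2987) as the denominator.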

Last edited by Bruce Weaver; 15 Oct 2023, 16:06. Reason: Added the PS.

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)
