  • 95% CI of Sen, Sp, PPV, NPV, and AUROC curve

    Dear colleagues,

    I want to present 95% confidence intervals (95% CIs) for sensitivity, specificity, PPV, NPV, and the AUROC.

    I know that 95% CIs for sensitivity, specificity, PPV, and NPV can be computed from the exact binomial distribution, and that the 95% CI for the AUROC can be computed using asymptotic normality. However, one of the co-investigators in our research suggested that a bias-corrected and accelerated (BCa) clustered bootstrap is better than the exact binomial and asymptotic normal methods, respectively. I don't know which approach is more precise. Please help us choose the best strategy.

    Thank you
    Abdullah
    Sincerely regards,
    Abdullah Algarni
    [email protected]

  • #2
    Hello Abdullah. The -roctab- command offers methods of computing a CI for AUC other than the asymptotic normal method. From the documentation:

    By default, roctab calculates the standard error for the area under the curve by using an algorithm
    suggested by DeLong, DeLong, and Clarke-Pearson (1988) and asymptotic normal confidence intervals.
    Optionally, standard errors based on methods suggested by Bamber (1975) or Hanley and McNeil (1982)
    can be computed by specifying bamber or hanley, respectively, and an exact binomial confidence
    interval can be obtained by specifying binomial.
    Perhaps your colleague would be satisfied with one of those methods?
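
    As a rough sketch of how these options compare, and of what a BCa bootstrap for the AUC might look like, the commands below use the Hanley & McNeil (1982) example data shipped with Stata (variables disease and rating). The number of replications and the seed are arbitrary choices for illustration. The example data have no cluster variable, so the clustered variant is only indicated in a comment, where hospid is a hypothetical name.

    ```stata
    clear
    use https://www.stata-press.com/data/r18/hanley

    * Default: DeLong et al. (1988) SE with an asymptotic normal CI
    roctab disease rating

    * Same AUC, alternative standard errors
    roctab disease rating, bamber
    roctab disease rating, hanley

    * Exact binomial CI for the AUC
    roctab disease rating, binomial

    * Bootstrap the AUC returned by -roctab- in r(area); the bca option
    * makes -bootstrap- also estimate the acceleration (via a jackknife).
    * For clustered data one would add cluster(hospid), where hospid is a
    * hypothetical cluster identifier not present in this dataset.
    bootstrap auc = r(area), reps(500) seed(12345) bca nodots: ///
        roctab disease rating

    * Report the BCa confidence interval
    estat bootstrap, bca
    ```

    With only one observation per cluster, as here, the cluster() option is not needed; whether the bootstrap is preferable to the analytic intervals depends on the design and sample size.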

    Re Sens, Spec, PV+ and PV- (the latter two are my preferred way of expressing PPV and NPV), do you mean for a particular cut-point that has been identified? Or are you talking about a much larger table showing these statistics for every possible cut-point?

    Perhaps you can generate an example similar to what you are doing using the Hanley & McNeil (1982) data that is used in the examples for -roctab-:

    Code:
    clear
    use https://www.stata-press.com/data/r18/hanley
    Sens, Spec, PV+ and PV- are all binomial proportions, so I would be inclined to report Wilson (or Agresti-Coull) CIs for them. YMMV.

    Cheers,
    Bruce
    --
    Bruce Weaver
    Email: [email protected]
    Version: Stata/MP 18.5 (Windows)


    • #3
      Thank you, Bruce!

      Hello Abdullah. The -roctab- command offers methods of computing a CI for AUC other than the asymptotic normal method. From the documentation:

      By default, roctab calculates the standard error for the area under the curve by using an algorithm
      suggested by DeLong, DeLong, and Clarke-Pearson (1988) and asymptotic normal confidence intervals.
      Optionally, standard errors based on methods suggested by Bamber (1975) or Hanley and McNeil (1982)
      can be computed by specifying bamber or hanley, respectively, and an exact binomial confidence
      interval can be obtained by specifying binomial.
      Perhaps your colleague would be satisfied with one of those methods?
      We will choose asymptotic normal confidence intervals.

      Re Sens Spec, PV+ and PV- (the latter two are my preferred way of expressing PPV and NPV), do you mean for a particular cut-point that has been identified? Or are you talking about a much larger table showing these statistics for every possible cut-point?

      Perhaps you can generate an example similar to what you are doing using the Hanley & McNeil (1982) data that is used in the examples for -roctab-:

      Code:
      clear
      use https://www.stata-press.com/data/r18/hanley
      The latter measures are all binomial proportions. So I would be inclined to report Wilson (or Agresti-Coull) CIs for them. YMMV.
      Yes! I meant for a particular cut-point that has been identified (qSOFA score ≥2). See my code below, please.
      Code:
      . roctab died30in qsofa2, detail table summary
      
         Patient |
            died |
          during |
      in-hospita |
           l F/U |
         (30-day |        qsofa2
           max)? |        No        Yes |     Total
      -----------+----------------------+----------
              No |     2,659        328 |     2,987 
             Yes |       166        121 |       287 
      -----------+----------------------+----------
           Total |     2,825        449 |     3,274 
      
      Detailed report of sensitivity and specificity
      ------------------------------------------------------------------------------
                                                 Correctly
      Cutpoint      Sensitivity   Specificity   classified          LR+          LR-
      ------------------------------------------------------------------------------
      ( >= No )         100.00%         0.00%        8.77%       1.0000     
      ( >= Yes )         42.16%        89.02%       84.91%       3.8394       0.6497
      ( >  Yes )          0.00%       100.00%       91.23%                    1.0000
      ------------------------------------------------------------------------------
      
      
                            ROC                     Asymptotic normal  
                 Obs       area     Std. err.      [95% conf. interval]
           ------------------------------------------------------------
               3,274     0.6559       0.0149        0.62674     0.68506
      Code:
      . diagt died30in qsofa2
      
         Patient |
            died |
          during |
      in-hospita |
           l F/U |
         (30-day |       qSOFA ≥2
           max)? |      Pos.       Neg. |     Total
      -----------+----------------------+----------
        Abnormal |       121        166 |       287 
          Normal |       328      2,659 |     2,987 
      -----------+----------------------+----------
           Total |       449      2,825 |     3,274 
      True abnormal diagnosis defined as died30in = 1 (labelled Yes)
      
      
                                                        [95% Confidence Interval]
      ---------------------------------------------------------------------------
      Prevalence                         Pr(A)      8.8%      7.8%      9.79%
      ---------------------------------------------------------------------------
      Sensitivity                      Pr(+|A)     42.2%     36.4%     48.1%
      Specificity                      Pr(-|N)       89%     87.8%     90.1%
      ROC area               (Sens. + Spec.)/2      .656      .627      .685 
      ---------------------------------------------------------------------------
      Likelihood ratio (+)     Pr(+|A)/Pr(+|N)      3.84      3.24      4.55 
      Likelihood ratio (-)     Pr(-|A)/Pr(-|N)       .65      .588      .718 
      Odds ratio                   LR(+)/LR(-)      5.91      4.55      7.67 
      Positive predictive value        Pr(A|+)     26.9%     22.9%     31.3% 
      Negative predictive value        Pr(N|-)     94.1%     93.2%       95% 
      ---------------------------------------------------------------------------
      Sincerely regards,
      Abdullah Algarni
      [email protected]


      • #4
        We will choose asymptotic normal confidence intervals.
        Now I am confused. What about the co-investigator who is opposed to that? Would he/she be satisfied with one of the other methods, such as the Hanley & McNeil (1982) method? As the output from my code shows (see below), it is computed as AUC +/- SE*zcrit, just like the ordinary asymptotic normal method, but it uses a different SE. My code also shows one way to get the Wilson (or Agresti-Coull) CIs I would prefer for Sens, Spec, PV+ and PV-. I hope this helps.

        Code:
        * Generate Abdullah's data    
        clear
        input died30in qsofa2 n
        0 0 2659
        0 1 328
        1 0 166
        1 1 121
        end
        expand n
        * We now have the data, at least for the 2 needed variables.
        * Use the hanley option for -roctab-
        roctab died30in qsofa2, detail table summary hanley
        * Compute the Hanley & McNeil CI for AUC "by hand"
        display _newline ///
        "        AUC = " r(area) _newline ///
        "Lower bound = " r(area) - r(se)*invnormal(.975) _newline ///
        "Upper bound = " r(area) + r(se)*invnormal(.975)
        
        * ssc install diagt // Uncomment to install -diagt- if necessary
        diagt died30in qsofa2  
        
        * Generate flag variables needed to get CIs for
        * Sens, Spec, PV+ and PV- using -ci proportion-.
        * The following commands assume 0/1 coding for both variables.
        generate Sens  = qsofa2 if died30in
        generate Spec  = !qsofa2 if !died30in
        generate PVpos = died30in if qsofa2
        generate PVneg = !died30in if !qsofa2
        
        * Now use -ci proportion- to compute the CIs
        * for these statistics.  The following foreach
        * structure loops through the exact (binomial),
        * Wilson, and Agresti-Coull methods.  Arguably,
        * the latter two methods have advantages over
        * the exact binomial CIs. I would likely report
        * the Wilson CIs.  YMMV.
        
        foreach method in "exact" "wilson" "agresti" {
         display _newline as result "CI method: " "`method'"    
         ci proportion Sens, `method'
         ci proportion Spec, `method'
         ci proportion PVpos, `method'  
         ci proportion PVneg, `method'  
        }
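
        For readers who want to see what the Wilson option computes, here is a minimal sketch that reproduces the Wilson interval for sensitivity by hand from the counts in the table above (121 qSOFA-positive among 287 deaths) and compares it with the immediate command -cii proportions-. The scalar names are made up for this illustration.

        ```stata
        clear
        * Counts taken from the 2x2 table: 121 of 287 deaths had qsofa2 = 1
        scalar n_pos = 121
        scalar n_tot = 287
        scalar phat  = n_pos / n_tot
        scalar zcrit = invnormal(0.975)

        * Wilson score interval:
        *   (phat + z^2/(2n) +/- z*sqrt(phat*(1-phat)/n + z^2/(4n^2))) / (1 + z^2/n)
        scalar denom  = 1 + zcrit^2 / n_tot
        scalar centre = (phat + zcrit^2 / (2 * n_tot)) / denom
        scalar halfw  = zcrit * sqrt(phat * (1 - phat) / n_tot ///
                        + zcrit^2 / (4 * n_tot^2)) / denom

        display "Wilson CI by hand: " %6.4f (centre - halfw) ///
                " to " %6.4f (centre + halfw)

        * The same interval from the immediate form of -ci-
        cii proportions 287 121, wilson
        ```

        The same pattern applies to Spec, PV+ and PV- by swapping in the corresponding numerator and denominator from the table.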
        PS - If I wanted to report CIs for the likelihood ratios, I would consider the MOVER-R method described here: I have not yet found an implementation of it in Stata. But I have not really done a proper thorough search yet either.
        Last edited by Bruce Weaver; 15 Oct 2023, 16:06. Reason: Added the PS.
        --
        Bruce Weaver
        Email: [email protected]
        Version: Stata/MP 18.5 (Windows)
