  • Sensitivity, Specificity, Positive predictive value, Negative predictive value, Youden Index

    Dear Experts,
    The objective of my study is to investigate whether the preoperative shock index (a continuous variable) predicts hypotension (a binary outcome). I would like to estimate the sensitivity, specificity, PPV, NPV, and Youden index. I am using Stata/IC 15.1.
    Thank you.
    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(serialno hypo_before_del) double shock_index
    30 1  .9117647058823529
    31 0  .6891891891891891
    32 0 1.0869565217391304
    33 0  .7969924812030075
    34 0               .712
    35 1  .9322033898305084
    36 0   .853448275862069
    37 0  .7777777777777778
    38 1  .8333333333333334
    39 0  .6830985915492958
    40 1  .5895522388059702
    41 0  1.045045045045045
    42 1  .9166666666666666
    43 0  .8545454545454545
    44 1  .8560606060606061
    45 1  .8333333333333334
    46 0  .7739130434782608
    47 0  .5636363636363636
    48 0  .7241379310344828
    49 0  .6131386861313869
    50 0  .6124031007751938
    51 1  .6956521739130435
    52 1  .5606060606060606
    53 0  .8032786885245902
    54 0   .782258064516129
    55 1  .9448818897637795
    56 0  .6538461538461539
    57 1           .6953125
    58 0  .8981481481481481
    59 0  .6271186440677966
    end
    ------------------ copy up to and including the previous line ------------------


  • #2
    Asish:
    you may want to consider:
    Code:
    . logistic hypo_before_del shock_index, vce(cluster serialno)
    
    Logistic regression                                     Number of obs =     30
                                                            Wald chi2(1)  =   0.32
                                                            Prob > chi2   = 0.5704
    Log pseudolikelihood = -19.529179                       Pseudo R2     = 0.0094
    
                                     (Std. err. adjusted for 30 clusters in serialno)
    ---------------------------------------------------------------------------------
                    |               Robust
    hypo_before_del | Odds ratio   std. err.      z    P>|z|     [95% conf. interval]
    ----------------+----------------------------------------------------------------
        shock_index |   5.482709   16.44056     0.57   0.570     .0153662    1956.251
              _cons |   .1531845   .3646242    -0.79   0.431     .0014425    16.26767
    ---------------------------------------------------------------------------------
    Note: _cons estimates baseline odds.
    
    . estat classification
    
    Logistic model for hypo_before_del
    
                  -------- True --------
    Classified |         D            ~D  |      Total
    -----------+--------------------------+-----------
         +     |         0             0  |          0
         -     |        11            19  |         30
    -----------+--------------------------+-----------
       Total   |        11            19  |         30
    
    Classified + if predicted Pr(D) >= .5
    True D defined as hypo_before_del != 0
    --------------------------------------------------
    Sensitivity                     Pr( +| D)    0.00%
    Specificity                     Pr( -|~D)  100.00%
    Positive predictive value       Pr( D| +)       .%
    Negative predictive value       Pr(~D| -)   63.33%
    --------------------------------------------------
    False + rate for true ~D        Pr( +|~D)    0.00%
    False - rate for true D         Pr( -| D)  100.00%
    False + rate for classified +   Pr(~D| +)       .%
    False - rate for classified -   Pr( D| -)   36.67%
    --------------------------------------------------
    Correctly classified                        63.33%
    --------------------------------------------------
    
    .
    It's trivial to notice that a single predictor makes your regression misspecified.
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      It is not meaningful to speak of sensitivity, specificity, NPV or PPV in the context of a continuous predictor. Those parameters are only meaningful once you pick a cutoff value for the continuous predictor: then you can define the operating characteristics of the dichotomous predictor corresponding to greater than vs less than the cutoff. The -estat classification- command recommended in #2 will, by default, use a cutoff of 0.5 predicted probability. That is seldom useful in real life. -estat classification- does have a -cutoff()- option that lets you specify the threshold of predicted probability you want to use. In your context it probably makes sense to first run -lroc- (after the logistic regression) to see a graph of sensitivity vs (1 minus) specificity: this will enable you to identify a range of cutoff values that produce reasonable sensitivity and specificity. Then you can run -estat classification- a few times with selected cutoffs to get quantitative estimates of the test's characteristics at those cutoffs.
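
      A rough sketch of that workflow follows. The variable names come from the example data in #1; the cutoffs 0.3, 0.4 and 0.5 are arbitrary placeholders chosen for illustration, not recommendations.

      ```stata
      * Sketch only: assumes hypo_before_del (0/1) and shock_index are in memory
      logistic hypo_before_del shock_index

      * ROC curve of the fitted model: sensitivity against 1 - specificity
      lroc

      * Operating characteristics at a few candidate probability cutoffs
      * (0.3, 0.4 and 0.5 are illustrative values only)
      foreach c in 0.3 0.4 0.5 {
          estat classification, cutoff(`c')
      }
      ```

      Keep in mind that -cutoff()- is on the predicted-probability scale, not on the scale of shock_index.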

      As for the Youden statistic, it is simply the sum of sensitivity and specificity, minus 1. It is widely used in the medical literature; it is also meaningless and useless in nearly all realistic scenarios.
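
      For a single cutoff on shock_index, the arithmetic can be sketched directly from counts. This is only an illustration: the cutoff 0.8 and the local macro names are assumptions, and Youden's J is taken here as sensitivity + specificity - 1.

      ```stata
      * Sketch only: assumes hypo_before_del (0/1) and shock_index are in memory;
      * the cutoff 0.8 and the local macro names are illustrative.
      local cut 0.8

      * Sensitivity: share classified positive among those with hypotension
      * (missing values count as larger than any number, so exclude them explicitly)
      quietly count if hypo_before_del == 1 & !missing(shock_index)
      local n_pos = r(N)
      quietly count if hypo_before_del == 1 & shock_index >= `cut' & !missing(shock_index)
      local sens = r(N) / `n_pos'

      * Specificity: share classified negative among those without hypotension
      quietly count if hypo_before_del == 0 & !missing(shock_index)
      local n_neg = r(N)
      quietly count if hypo_before_del == 0 & shock_index < `cut'
      local spec = r(N) / `n_neg'

      display "Sensitivity = " %6.4f `sens'
      display "Specificity = " %6.4f `spec'
      display "Youden J    = " %6.4f (`sens' + `spec' - 1)
      ```

      Repeating this for several values of the local macro cut shows how the three quantities move together as the threshold changes.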



        • #5
          Carlo Lazzaro Clyde Schechter, I have added the output in a better way in the post below.
          Last edited by Asish Subedi; 25 Dec 2021, 08:11.



          • #6
            @Carlo Lazzaro @Clyde Schechter, following your suggestions I ran the analysis below, chose .80 as the cutoff, and carried out the subsequent steps. Please correct me if there are any mistakes.


            . dataex hypo_before_del shock_index

            ----------------------- copy starting from the next line -----------------------
            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input byte hypo_before_del double shock_index
            0                 .8
            0            .640625
            1 .49264705882352944
            1            .765625
            1  .8455284552845529
            0  .8016528925619835
            1  .8548387096774194
            0  .9333333333333333
            1  1.016260162601626
            0  .9158878504672897
            1  .8688524590163934
            1               .825
            0  .7465753424657534
            0  .9619047619047619
            1  .7874015748031497
            1  .6691729323308271
            0  .8761061946902655
            1  .9370629370629371
            0  .6521739130434783
            1  .8793103448275862
            1               .824
            1  .8916666666666667
            0           .6015625
            0  .7301587301587301
            0  .5857142857142857
            0  .7166666666666667
            1  .6611570247933884
            0   .849624060150376
            0                  1
            .                  .
            .                  .
            end
            ------------------ copy up to and including the previous line ------------------




            Code:
            . roctab hypo_before_del shock_index, detail

            Detailed report of sensitivity and specificity
            ------------------------------------------------------------------------------
                                                       Correctly
            Cutpoint      Sensitivity   Specificity   Classified          LR+          LR-
            ------------------------------------------------------------------------------
            ( >= .4926.. )    100.00%         0.00%       48.28%       1.0000
            ( >= .5857.. )     92.86%         0.00%       44.83%       0.9286
            ( >= .6015.. )     92.86%         6.67%       48.28%       0.9949       1.0714
            ( >= .640625 )     92.86%        13.33%       51.72%       1.0714       0.5357
            ( >= .6521.. )     92.86%        20.00%       55.17%       1.1607       0.3571
            ( >= .661157 )     92.86%        26.67%       58.62%       1.2662       0.2679
            ( >= .6691.. )     85.71%        26.67%       55.17%       1.1688       0.5357
            ( >= .7166.. )     78.57%        26.67%       51.72%       1.0714       0.8036
            ( >= .7301.. )     78.57%        33.33%       55.17%       1.1786       0.6429
            ( >= .7465.. )     78.57%        40.00%       58.62%       1.3095       0.5357
            ( >= .765625 )     78.57%        46.67%       62.07%       1.4732       0.4592
            ( >= .7874.. )     71.43%        46.67%       58.62%       1.3393       0.6122
            ( >= .8 )          64.29%        46.67%       55.17%       1.2054       0.7653
            ( >= .8016.. )     64.29%        53.33%       58.62%       1.3776       0.6696
            ( >= .824 )        64.29%        60.00%       62.07%       1.6071       0.5952
            ( >= .825 )        57.14%        60.00%       58.62%       1.4286       0.7143
            ( >= .8455.. )     50.00%        60.00%       55.17%       1.2500       0.8333
            ( >= .8496.. )     42.86%        60.00%       51.72%       1.0714       0.9524
            ( >= .8548.. )     42.86%        66.67%       55.17%       1.2857       0.8571
            ( >= .8688.. )     35.71%        66.67%       51.72%       1.0714       0.9643
            ( >= .8761.. )     28.57%        66.67%       48.28%       0.8571       1.0714
            ( >= .8793.. )     28.57%        73.33%       51.72%       1.0714       0.9740
            ( >= .8916.. )     21.43%        73.33%       48.28%       0.8036       1.0714
            ( >= .9158.. )     14.29%        73.33%       44.83%       0.5357       1.1688
            ( >= .9333.. )     14.29%        80.00%       48.28%       0.7143       1.0714
            ( >= .9370.. )     14.29%        86.67%       51.72%       1.0714       0.9890
            ( >= .9619.. )      7.14%        86.67%       48.28%       0.5357       1.0714
            ( >= 1 )            7.14%        93.33%       51.72%       1.0714       0.9949
            ( >= 1.01626 )      7.14%       100.00%       55.17%                    0.9286
            ( >  1.01626 )      0.00%       100.00%       51.72%                    1.0000
            ------------------------------------------------------------------------------
            From these results, I chose .80 as the cutoff.
            Next, I dichotomized the shock index:


            gen shockindex=1

            replace shockindex=0 if shock_index<=0.80

            Then I get the following results:


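            Code:
            . roctab hypo_before_del shockindex, detail

            Detailed report of sensitivity and specificity
            ------------------------------------------------------------------------------
                                                       Correctly
            Cutpoint      Sensitivity   Specificity   Classified          LR+          LR-
            ------------------------------------------------------------------------------
            ( >= 0 )          100.00%         0.00%       48.28%       1.0000
            ( >= 1 )           64.29%        53.33%       58.62%       1.3776       0.6696
            ( >  1 )            0.00%       100.00%       51.72%                    1.0000
            ------------------------------------------------------------------------------


                                  ROC                    -Asymptotic Normal--
                       Obs       Area     Std. Err.      [95% Conf. Interval]
                 ------------------------------------------------------------
                        29     0.5881       0.0941        0.40361     0.77258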
            Next,

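            Code:
            . logistic hypo_before_del shockindex

            Logistic regression                             Number of obs     =         29
                                                            LR chi2(1)        =       0.91
                                                            Prob > chi2       =     0.3389
            Log likelihood = -19.626647                     Pseudo R2         =     0.0228

            ---------------------------------------------------------------------------------
            hypo_before_del | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
            ----------------+----------------------------------------------------------------
                 shockindex |   2.057143   1.565279     0.95   0.343     .4630048    9.139941
                      _cons |       .625   .3563048    -0.82   0.410     .2044657    1.910467
            ---------------------------------------------------------------------------------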

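            Code:
            . estat classification

            Logistic model for hypo_before_del

                          -------- True --------
            Classified |         D            ~D  |      Total
            -----------+--------------------------+-----------
                 +     |         9             7  |         16
                 -     |         5             8  |         13
            -----------+--------------------------+-----------
               Total   |        14            15  |         29

            Classified + if predicted Pr(D) >= .5
            True D defined as hypo_before_del != 0
            --------------------------------------------------
            Sensitivity                     Pr( +| D)   64.29%
            Specificity                     Pr( -|~D)   53.33%
            Positive predictive value       Pr( D| +)   56.25%
            Negative predictive value       Pr(~D| -)   61.54%
            --------------------------------------------------
            False + rate for true ~D        Pr( +|~D)   46.67%
            False - rate for true D         Pr( -| D)   35.71%
            False + rate for classified +   Pr(~D| +)   43.75%
            False - rate for classified -   Pr( D| -)   38.46%
            --------------------------------------------------
            Correctly classified                        58.62%
            --------------------------------------------------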
            As per your feedback, I used a cutoff of 0.80 as the predicted probability.


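            Code:
            . estat classification, cutoff(.80)

            Logistic model for hypo_before_del

                          -------- True --------
            Classified |         D            ~D  |      Total
            -----------+--------------------------+-----------
                 +     |         0             0  |          0
                 -     |        14            15  |         29
            -----------+--------------------------+-----------
               Total   |        14            15  |         29

            Classified + if predicted Pr(D) >= .8
            True D defined as hypo_before_del != 0
            --------------------------------------------------
            Sensitivity                     Pr( +| D)    0.00%
            Specificity                     Pr( -|~D)  100.00%
            Positive predictive value       Pr( D| +)       .%
            Negative predictive value       Pr(~D| -)   51.72%
            --------------------------------------------------
            False + rate for true ~D        Pr( +|~D)    0.00%
            False - rate for true D         Pr( -| D)  100.00%
            False + rate for classified +   Pr(~D| +)       .%
            False - rate for classified -   Pr( D| -)   48.28%
            --------------------------------------------------
            Correctly classified                        51.72%
            --------------------------------------------------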

            Now, which of the two -estat classification- outputs is the correct one?

            Code:
             lroc
            
            Logistic model for hypo_before_del
            
            number of observations = 29
            area under ROC curve = 0.5881

            regards,
            Asish Subedi
            Last edited by Asish Subedi; 25 Dec 2021, 08:14.



            • #7
              You are getting contradictory results because you are confusing two different cutoffs. In your raw data, analyzed with -roctab-, the only cutoff under consideration is the value of shock_index, which you chose to set at 0.8. Fine.

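              Then you create a dichotomous variable--but you did it incorrectly. You have missing values for shock_index, and the way you created shockindex, it gets set to 1 when shock_index is missing. (Here those observations are also missing hypo_before_del, so -roctab- drops them anyway; the visible discrepancy in specificity, 53.33% vs 46.67%, arises because -replace shockindex=0 if shock_index<=0.80- puts the observation with shock_index exactly 0.8 in the 0 group, which reproduces the ( >= .8016.. ) row rather than the ( >= .8 ) row.) If you did it correctly, you would see that the dichotomous variable exactly reproduces the results you get from selecting the 0.8 cutoff in the first -roctab- output: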

              Code:
              . roctab hypo_before_del shock_index, detail
              
              Detailed report of sensitivity and specificity
              ------------------------------------------------------------------------------
                                                         Correctly
              Cutpoint      Sensitivity   Specificity   classified          LR+          LR-
              ------------------------------------------------------------------------------
              ( >= .4926.. )    100.00%         0.00%       48.28%       1.0000    
              ( >= .5857.. )     92.86%         0.00%       44.83%       0.9286    
              ( >= .6015.. )     92.86%         6.67%       48.28%       0.9949       1.0714
              ( >= .640625 )     92.86%        13.33%       51.72%       1.0714       0.5357
              ( >= .6521.. )     92.86%        20.00%       55.17%       1.1607       0.3571
              ( >= .661157 )     92.86%        26.67%       58.62%       1.2662       0.2679
              ( >= .6691.. )     85.71%        26.67%       55.17%       1.1688       0.5357
              ( >= .7166.. )     78.57%        26.67%       51.72%       1.0714       0.8036
              ( >= .7301.. )     78.57%        33.33%       55.17%       1.1786       0.6429
              ( >= .7465.. )     78.57%        40.00%       58.62%       1.3095       0.5357
              ( >= .765625 )     78.57%        46.67%       62.07%       1.4732       0.4592
              ( >= .7874.. )     71.43%        46.67%       58.62%       1.3393       0.6122
              ( >= .8 )          64.29%        46.67%       55.17%       1.2054       0.7653
              ( >= .8016.. )     64.29%        53.33%       58.62%       1.3776       0.6696
              ( >= .824 )        64.29%        60.00%       62.07%       1.6071       0.5952
              ( >= .825 )        57.14%        60.00%       58.62%       1.4286       0.7143
              ( >= .8455.. )     50.00%        60.00%       55.17%       1.2500       0.8333
              ( >= .8496.. )     42.86%        60.00%       51.72%       1.0714       0.9524
              ( >= .8548.. )     42.86%        66.67%       55.17%       1.2857       0.8571
              ( >= .8688.. )     35.71%        66.67%       51.72%       1.0714       0.9643
              ( >= .8761.. )     28.57%        66.67%       48.28%       0.8571       1.0714
              ( >= .8793.. )     28.57%        73.33%       51.72%       1.0714       0.9740
              ( >= .8916.. )     21.43%        73.33%       48.28%       0.8036       1.0714
              ( >= .9158.. )     14.29%        73.33%       44.83%       0.5357       1.1688
              ( >= .9333.. )     14.29%        80.00%       48.28%       0.7143       1.0714
              ( >= .9370.. )     14.29%        86.67%       51.72%       1.0714       0.9890
              ( >= .9619.. )      7.14%        86.67%       48.28%       0.5357       1.0714
              ( >= 1 )            7.14%        93.33%       51.72%       1.0714       0.9949
              ( >= 1.01626 )      7.14%       100.00%       55.17%                    0.9286
              ( >  1.01626 )      0.00%       100.00%       51.72%                    1.0000
              ------------------------------------------------------------------------------
              
              
                                    ROC                     Asymptotic normal  
                         Obs       area     Std. err.      [95% conf. interval]
                   ------------------------------------------------------------
                          29     0.5667       0.1121        0.34691     0.78642
              
              .
              . gen byte shockindex = (shock_index >= 0.8) if !missing(shock_index)
              (2 missing values generated)
              
              . roctab hypo_before_del shockindex, detail
              
              Detailed report of sensitivity and specificity
              ------------------------------------------------------------------------------
                                                         Correctly
              Cutpoint      Sensitivity   Specificity   classified          LR+          LR-
              ------------------------------------------------------------------------------
              ( >= 0 )          100.00%         0.00%       48.28%       1.0000    
              ( >= 1 )           64.29%        46.67%       55.17%       1.2054       0.7653
              ( >  1 )            0.00%       100.00%       51.72%                    1.0000
              ------------------------------------------------------------------------------
              
              
                                    ROC                     Asymptotic normal  
                         Obs       area     Std. err.      [95% conf. interval]
                   ------------------------------------------------------------
                          29     0.5548       0.0941        0.37028     0.73925
              Then you confuse things further by going to logistic regression and -estat classification-. You use those correctly, but what you misunderstand is that the -cutoff()- used by -estat classification- is not the value of shock_index. Rather, it is the probability of hypo_before_del predicted by the logistic regression model, which is a completely different matter.

              The correct values of sensitivity and specificity associated with using a cutoff of 0.8 value of shock_index are the ones shown above, generated from -roctab-. The values you are getting from -estat classification, cutoff(0.8)- after logistic regression are something else altogether as they are based on a different cutoff.
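
              To see how the two scales relate, one can ask the model what predicted probability corresponds to shock_index = 0.8. The following is a minimal sketch, assuming the logistic model is fitted on the continuous shock_index rather than on the dichotomized variable:

              ```stata
              * Sketch only: assumes hypo_before_del (0/1) and shock_index are in memory
              logistic hypo_before_del shock_index

              * Predicted probability of hypotension at shock_index = 0.8, computed by
              * hand from the coefficients (which -logistic- stores on the log-odds scale)
              display invlogit(_b[_cons] + _b[shock_index]*0.8)

              * The same quantity, with a standard error, from -margins-
              margins, at(shock_index = 0.8)
              ```

              In principle it is this probability, not 0.8 itself, that would have to be given to -cutoff()- for -estat classification- to mimic the shock_index >= 0.8 rule, and that correspondence holds only when the fitted odds ratio is above 1, so that predicted probability rises with shock_index.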



              • #8
                @Clyde Schechter, thank you for rectifying my mistakes. I used three different techniques to find the optimal cutoff point (ROC, the Liu method, and bootstrap). All of them gave 0.8016 as the optimal cutoff point, with sensitivity and specificity of 0.67 and 0.60 respectively. Also, after converting the shock index variable into a dichotomous one (based on that cutoff), I get the same sensitivity and specificity.
                Then I ran logistic regression, -estat classification-, and -lroc- (based on the dichotomous shock index variable) and found the following:
                From -estat classification- (with the default predicted-probability cutoff of 0.5): sensitivity 0.67, specificity 0.60, PPV 0.62, NPV 0.64.
                Should I stick with these values?
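
                For a dichotomous predictor, these four quantities can also be read off a plain cross-tabulation, without fitting a model. A minimal sketch, assuming shockindex is the 0/1 variable built from the chosen cutoff:

                ```stata
                * Sketch only: assumes shockindex (0/1) and hypo_before_del (0/1) are in memory
                * Column percentages: sensitivity = % shockindex==1 within hypo_before_del==1
                *                     specificity = % shockindex==0 within hypo_before_del==0
                * Row percentages:    PPV = % hypo_before_del==1 within shockindex==1
                *                     NPV = % hypo_before_del==0 within shockindex==0
                tabulate shockindex hypo_before_del, row column
                ```

                PPV and NPV obtained this way reflect the proportion with hypotension in this particular sample, so they would differ in a population with a different prevalence.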

                I came across this video which focused on Optimal Cut-Points for Continuous Predictors to Discriminate Disease Outcomes.
                https://www.youtube.com/watch?v=UnlD0VT1dPQ
                From the video, I learned that there are several methods: odds ratio, ROC, Gini index, kappa statistic, chi-square statistic, Youden index, and misclassification rate.

                My first question is: which method for selecting the optimal cutoff point should I report in my manuscript?

                From the -lroc- command, I get an area under the ROC curve of 0.63.

                Next, with the command below I get a probability cutoff.

                lsens , gensens(sens_shockindex) genspec(spec_shockindex)

                How do I interpret this graph?
                Regards,
                Asish

                ----------------------- copy starting from the next line -----------------------
                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input byte hypo_before_del double shock_index byte shockindex
                1  1.016260162601626 1
                1  .6691729323308271 0
                0  .7465753424657534 0
                0                  1 1
                1  .8455284552845529 1
                0  .9158878504672897 1
                1 .49264705882352944 0
                1  .9370629370629371 1
                0                 .8 0
                .                  . .
                1               .825 1
                0   .849624060150376 1
                0  .7166666666666667 0
                1  .8548387096774194 1
                0  .5857142857142857 0
                1  .8688524590163934 1
                0  .8016528925619835 0
                0  .6521739130434783 0
                0  .9619047619047619 1
                1  .6611570247933884 0
                1  .9117647058823529 1
                1  .8916666666666667 1
                0           .6015625 0
                1  .8793103448275862 1
                0            .640625 0
                .                  . .
                0  .7301587301587301 0
                1  .7874015748031497 0
                0  .8761061946902655 1
                0  .9333333333333333 1
                1            .765625 0
                1               .824 1
                end
                ------------------ copy up to and including the previous line ------------------

                Listed 32 out of 32 observations

                Code:
                  .  roctab hypo_before_del shock_index, detail
                
                Detailed report of sensitivity and specificity
                ------------------------------------------------------------------------------
                                                           Correctly
                Cutpoint      Sensitivity   Specificity   Classified          LR+          LR-
                ------------------------------------------------------------------------------
                ( >= .4926.. )    100.00%         0.00%       50.00%       1.0000     
                ( >= .5857.. )     93.33%         0.00%       46.67%       0.9333     
                ( >= .6015.. )     93.33%         6.67%       50.00%       1.0000       1.0000
                ( >= .640625 )     93.33%        13.33%       53.33%       1.0769       0.5000
                ( >= .6521.. )     93.33%        20.00%       56.67%       1.1667       0.3333
                ( >= .661157 )     93.33%        26.67%       60.00%       1.2727       0.2500
                ( >= .6691.. )     86.67%        26.67%       56.67%       1.1818       0.5000
                ( >= .7166.. )     80.00%        26.67%       53.33%       1.0909       0.7500
                ( >= .7301.. )     80.00%        33.33%       56.67%       1.2000       0.6000
                ( >= .7465.. )     80.00%        40.00%       60.00%       1.3333       0.5000
                ( >= .765625 )     80.00%        46.67%       63.33%       1.5000       0.4286
                ( >= .7874.. )     73.33%        46.67%       60.00%       1.3750       0.5714
                ( >= .8 )          66.67%        46.67%       56.67%       1.2500       0.7143
                ( >= .8016.. )     66.67%        53.33%       60.00%       1.4286       0.6250
                ( >= .824 )        66.67%        60.00%       63.33%       1.6667       0.5556
                ( >= .825 )        60.00%        60.00%       60.00%       1.5000       0.6667
                ( >= .8455.. )     53.33%        60.00%       56.67%       1.3333       0.7778
                ( >= .8496.. )     46.67%        60.00%       53.33%       1.1667       0.8889
                ( >= .8548.. )     46.67%        66.67%       56.67%       1.4000       0.8000
                ( >= .8688.. )     40.00%        66.67%       53.33%       1.2000       0.9000
                ( >= .8761.. )     33.33%        66.67%       50.00%       1.0000       1.0000
                ( >= .8793.. )     33.33%        73.33%       53.33%       1.2500       0.9091
                ( >= .8916.. )     26.67%        73.33%       50.00%       1.0000       1.0000
                ( >= .9117.. )     20.00%        73.33%       46.67%       0.7500       1.0909
                ( >= .9158.. )     13.33%        73.33%       43.33%       0.5000       1.1818
                ( >= .9333.. )     13.33%        80.00%       46.67%       0.6667       1.0833
                ( >= .9370.. )     13.33%        86.67%       50.00%       1.0000       1.0000
                ( >= .9619.. )      6.67%        86.67%       46.67%       0.5000       1.0769
                ( >= 1 )            6.67%        93.33%       50.00%       1.0000       1.0000
                ( >= 1.01626 )      6.67%       100.00%       53.33%                    0.9333
                ( >  1.01626 )      0.00%       100.00%       50.00%                    1.0000
                ------------------------------------------------------------------------------
                
                
                                      ROC                    -Asymptotic Normal--
                           Obs       Area     Std. Err.      [95% Conf. Interval]
                     ------------------------------------------------------------
                            30     0.5778       0.1105        0.36125     0.79430
                Code:
                     . cutpt hypo_before_del shock_index, noadjust
                
                Empirical cutpoint estimation
                Method:                                Liu
                Reference variable:                    hypo_before_del (0=neg, 1=pos)
                Classification variable:               shock_index
                Empirical optimal cutpoint:            .80165289
                Sensitivity at cutpoint:               0.67
                Specificity at cutpoint:               0.60
                Area under ROC curve at cutpoint:      0.63
                Code:
                   . bootstrap e(cutpoint), rep(100): cutpt hypo_before_del shock_index, noadjust
                (running cutpt on estimation sample)
                
                Bootstrap replications (100)
                ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
                ..................x................x..............    50
                .......x...x............x.........................   100
                
                Bootstrap results                               Number of obs     =         30
                                                                Replications      =         95
                
                      command:  cutpt hypo_before_del shock_index, noadjust
                        _bs_1:  e(cutpoint)
                
                ------------------------------------------------------------------------------
                             |   Observed   Bootstrap                         Normal-based
                             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                       _bs_1 |   .8016529   .0457482    17.52   0.000     .7119881    .8913177
                ------------------------------------------------------------------------------
                Note: One or more parameters could not be estimated in 5 bootstrap replicates;
                      standard-error estimates include only complete replications.
                Code:
                   . roctab hypo_before_del shockindex, detail
                
                Detailed report of sensitivity and specificity
                ------------------------------------------------------------------------------
                                                           Correctly
                Cutpoint      Sensitivity   Specificity   Classified          LR+          LR-
                ------------------------------------------------------------------------------
                ( >= 0 )          100.00%         0.00%       50.00%       1.0000     
                ( >= 1 )           66.67%        60.00%       63.33%       1.6667       0.5556
                ( >  1 )            0.00%       100.00%       50.00%                    1.0000
                ------------------------------------------------------------------------------
                
                
                                      ROC                    -Asymptotic Normal--
                           Obs       Area     Std. Err.      [95% Conf. Interval]
                     ------------------------------------------------------------
                            30     0.6333       0.0909        0.45527     0.81140
                Code:
                 . logistic hypo_before_del shockindex, vce(cluster serialno)
                
                Logistic regression                             Number of obs     =         30
                                                                Wald chi2(1)      =       2.02
                                                                Prob > chi2       =     0.1553
                Log pseudolikelihood = -19.709604               Pseudo R2         =     0.0522
                
                                                 (Std. Err. adjusted for 30 clusters in serialno)
                ---------------------------------------------------------------------------------
                                |               Robust
                hypo_before_del | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
                ----------------+----------------------------------------------------------------
                     shockindex |          3   2.319334     1.42   0.155     .6592463    13.65195
                          _cons |   .5555556   .3151715    -1.04   0.300       .18274    1.688968
                ---------------------------------------------------------------------------------
                Note: _cons estimates baseline odds.
                Code:
                  .  estat classification
                
                Logistic model for hypo_before_del
                
                              -------- True --------
                Classified |         D            ~D  |      Total
                -----------+--------------------------+-----------
                     +     |        10             6  |         16
                     -     |         5             9  |         14
                -----------+--------------------------+-----------
                   Total   |        15            15  |         30
                
                Classified + if predicted Pr(D) >= .5
                True D defined as hypo_before_del != 0
                --------------------------------------------------
                Sensitivity                     Pr( +| D)   66.67%
                Specificity                     Pr( -|~D)   60.00%
                Positive predictive value       Pr( D| +)   62.50%
                Negative predictive value       Pr(~D| -)   64.29%
                --------------------------------------------------
                False + rate for true ~D        Pr( +|~D)   40.00%
                False - rate for true D         Pr( -| D)   33.33%
                False + rate for classified +   Pr(~D| +)   37.50%
                False - rate for classified -   Pr( D| -)   35.71%
                --------------------------------------------------
                Correctly classified                        63.33%
                --------------------------------------------------
                Code:
                 
                . lroc
                
                Logistic model for hypo_before_del
                
                number of observations =       30
                area under ROC curve   =   0.6333
                Code:
                   lsens , gensens(sens_shockindex) genspec(spec_shockindex)

                Attached Files


                • #9
                  Then, I did logistic regression, estat classification, and lroc (based on dichotomous shock index variable) and found the following:
                  From estat classification (with a predicted probability cutoff of 0.5): sensitivity 0.67, specificity 0.60, PPV 0.62, NPV 0.64
                  Should I stick with these values?
                  Yes.
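
                  For reference, those figures can be reproduced by hand from the 2x2 table, and the Youden index (which -estat classification- does not print) is just sensitivity + specificity - 1. A minimal sketch, with the cell counts typed in from the classification table above; the local macro names are only illustrative:

                  ```stata
                  * Cell counts copied from the -estat classification- table above
                  local tp = 10    // classified +, true D
                  local fp = 6     // classified +, true ~D
                  local fn = 5     // classified -, true D
                  local tn = 9     // classified -, true ~D

                  local sens = `tp'/(`tp' + `fn')
                  local spec = `tn'/(`tn' + `fp')

                  display "Sensitivity  = " %6.4f (`sens')
                  display "Specificity  = " %6.4f (`spec')
                  display "PPV          = " %6.4f (`tp'/(`tp' + `fp'))
                  display "NPV          = " %6.4f (`tn'/(`tn' + `fn'))
                  display "Youden index = " %6.4f (`sens' + `spec' - 1)
                  ```

                  Run the lines together (local macros are lost if a do-file is executed one line at a time). The first four values should agree with the table; the Youden index comes to about 0.27.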

                  From the lroc command, I get an area under the ROC curve of 0.63.
                  So, this is a kind of mediocre result. The discrimination is a little better than an uninformed guess, but not much.

                  My first question is which method for selecting the optimal cutoff point should I report in my manuscript?
                  As you note, there are many approaches. And you did not even mention the one that I favor in most circumstances: decision theory. There are varying reasons for preferring one way or another in different circumstances. No one is best. You should report whichever one you finally settled on. If several of them led you to the same point, you can mention those as well.

                  How do I interpret this graph?
                  That kind of graph is not useful for a dichotomous predictor. Although it purports to show results for 4 probability cutoffs, in fact it is only showing one, the one near the center of the graph. The others are meaningless with a dichotomous predictor. This kind of graph is really intended for use directly with a continuous predictor. And it is just a different way of showing the information that is contained in the ROC graph. If you are going to show a graph in your report, the ROC graph from the continuous version of your index would be a better choice.


                  • #10
                    I have not seen this done much (if at all) in medical & health related research, but I think it is useful to report the Gini coefficient in addition to the AUC, as it gives the area between the ROC curve and the diagonal as a proportion of the total area above the diagonal. For Asish's data:

                    Code:
                    . display "GINI = 2*AUC-1 = " 2*0.6333-1
                    GINI = 2*AUC-1 = .2666
                    I once heard someone describe the Gini coefficient as a chance-corrected AUC, and thought that was a pretty good way to think of it.
                    --
                    Bruce Weaver
                    Version: Stata/MP 18.5 (Windows)


                    • #11
                      Clyde Schechter, Thank you for enlightening me with the use of decision curve analysis (DCA). As you are aware, I wanted to see whether shock index (dichotomized) is associated with hypotension.
                      From the DCA, I observed the range of threshold probabilities between 33% and 62%. As a result, with an approximate threshold probability of 50%, the probability of hypotension was greater than 50% with the help of the shock index. I hope the interpretation is correct.

                      Code:
                       dca hypo_before_del shockindex
                      Second, as suggested I have presented the probability cutoff using the continuous predictor (shock_index) and found it to be approx. 50%. Is the interpretation the same as we did for DCA?

                      Code:
                         lsens , gensens(sens_shock_index) genspec(spec_shock_index)
                      @Bruce Weaver, thank you for your input regarding the Gini index.

                      regards,
                      Asish Subedi

                      Attached Files


                      • #12
                        Well, the -dca- program is nice, but it has some limitations, and it also requires some care in its use and interpretation. It implicitly assumes that the disutility associated with treating a false positive is the same as the disutility of not treating a false negative. That is not usually the case in reality. On the plus side, it does allow the user to specify a harm associated with the test itself. Whether your shock_index variable can be said to be cost-free and risk-free I do not know, as you haven't really said anything about it. But if it requires some level of risk or cost (say, for example, it requires something other than reviewing existing known attributes of the patient) then some amount of harm should be posited. Also, -dca- allows you to specify the prevalence in the target population for this test. As you did not specify that option, it defaults to assuming that the population prevalence is the same as the prevalence in your data sample. Whether that is appropriate depends on whether your sample is representative of the population. Again, as you have said nothing about how your sample was accrued, I can't comment more specifically.

                        Finally, I don't think that dichotomous predictors are suitable for use with -dca-. The whole concept behind -dca- is to identify an appropriate cutoff for defining (and acting on) a positive test result from a continuous predictor. A dichotomous predictor has no range of values for offering a choice of threshold.

                        Decision analysis (as opposed to the limited implementation of it in the -dca- software), more generally, overcomes all of these limitations--but it is more complicated to carry out and does not easily reduce itself to a simple software program as many different factors in different contexts may need to be incorporated. I think for your purposes, -dca- is a decent approximation, although, since I don't know what the consequences of

                        Ultimately, the way to read the graphs is to look at the range of threshold probabilities over which the net benefit for the index is higher than both the net benefit of Treat All and the net benefit of Treat None. Using a predicted probability in that range as the threshold to define a positive test result will, under those (usually unrealistic) assumptions about disutility, produce greater net benefit than not using the test and adopting a policy of either treating everybody or treating nobody.


                        • #13
                          Having not used -dca- in a while, I decided to re-read the Vickers and Elkin article in Medical Decision Making on which it is based. I realize now that some of what I said in #12 was wrong. It does not implicitly assume that the disutility of a false negative test is the same as the disutility of a false positive. Rather, it assumes that the choice of a particular threshold probability of disease as a trigger for treatment implicitly determines that tradeoff, through the equation (Net Benefit of Treatment of a True Case)/(Net Harm of Unnecessary Treatment) = (1-p)/p, where p is the threshold probability, and they provide the algebraic argument supporting that assumption.
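
                          To put numbers on that tradeoff, here is a minimal sketch (plain arithmetic, not -dca- itself) of the net benefit formula from that article, net benefit = TP/n - (FP/n)*p/(1-p), evaluated at an assumed threshold of p = 0.5. The counts are typed in from the -estat classification- table posted earlier in the thread for the dichotomized shock index; the local macro names are only illustrative.

                          ```stata
                          * Counts from the earlier -estat classification- table (dichotomized index)
                          local n  = 30     // total observations
                          local d  = 15     // true cases (hypotension)
                          local nd = 15     // true non-cases
                          local tp = 10     // true positives
                          local fp = 6      // false positives

                          local p = 0.5                  // assumed threshold probability
                          local w = `p'/(1 - `p')        // weight on a false positive implied by p

                          display "Net benefit, index      = " %6.4f (`tp'/`n' - `w'*`fp'/`n')
                          display "Net benefit, treat all  = " %6.4f (`d'/`n'  - `w'*`nd'/`n')
                          display "Net benefit, treat none = " %6.4f (0)
                          ```

                          At p = 0.5 the weight is 1, so the index gives a net benefit of about 0.13 against 0 for both treat-all and treat-none. Changing the local p shows how the implied tradeoff, and with it the comparison, shifts.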
                          Last edited by Clyde Schechter; 30 Dec 2021, 14:50.


                          • #14
                            @Clyde Schechter, shock index is an observed calculated variable; it is cost-free and not an invasive procedure like taking a biopsy. I guess DCA may not be required in my case.


                            • #15
                              Clyde Schechter Asish Subedi Carlo Lazzaro Bruce Weaver Thanks for this thread. Might you have any suggestions how to do a similar analysis (i.e., estat classification with pre-determined cut-offs) but incorporating survey weights? That is, estimating correct classification rate for various cutoff values and incorporating pweights.

                              I have used

                              Code:
                              rocreg y x [pweight=wt], probit ml
                              Code:
                              senspec y x [pweigh=wt], se(sens) sp(spec)
                              list sens spec if x==x1
                              list sens spec if x==x2
                              to calculate overall AUC, and sensitivity and specificity at set cutoff values (x= x1, x2, ...xn). I would also like to calculate the correct classification rate for each of the cutoff values (x1, x2..xn), which I (think I) could do with estat classification if it accepted pweights

                              Code:
                              probit y x [pweight=wt]
                              estat classification, cut(x1)
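
                              One possible workaround, sketched under assumptions and not checked against -estat classification-: classify directly on x at a fixed cutoff and take weighted means of the resulting indicators. Here y is assumed to be coded 0/1, the cutoff 0.8 is a made-up placeholder for x1, and testpos and correct are hypothetical new variable names.

                              ```stata
                              * Sketch only: y (0/1), x and wt as above; 0.8 stands in for the cutoff x1
                              local cut = 0.8

                              * Test-positive indicator; the if-qualifier keeps missing x from counting as positive
                              generate byte testpos = (x >= `cut') if !missing(x)

                              * 1 if the classification matches the outcome, 0 otherwise
                              generate byte correct = (testpos == y) if !missing(testpos, y)

                              * Weighted correct classification rate
                              mean correct [pweight = wt]

                              * Weighted share testing positive by outcome:
                              * 1 - specificity where y == 0, sensitivity where y == 1
                              mean testpos [pweight = wt], over(y)
                              ```

                              Repeating the block with a different value of the local cut would give the rate for each of x2, ..., xn.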
                              Thanks in advance for any thoughts you might share.
                              Itai
