Sensitivity, Specificity, Positive predictive value, Negative predictive value, Youden Index

Asish Subedi

Join Date: Jul 2021
Posts: 15


22 Dec 2021, 23:25

Dear Experts,
The objective of my study is to investigate whether the preoperative shock index (a continuous variable) predicts hypotension (a binary outcome). I want to find the sensitivity, specificity, PPV, NPV, and Youden index. I am using Stata/IC 15.1.
Thank you.
----------------------- copy starting from the next line -----------------------

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(serialno hypo_before_del) double shock_index
30 1  .9117647058823529
31 0  .6891891891891891
32 0 1.0869565217391304
33 0  .7969924812030075
34 0               .712
35 1  .9322033898305084
36 0   .853448275862069
37 0  .7777777777777778
38 1  .8333333333333334
39 0  .6830985915492958
40 1  .5895522388059702
41 0  1.045045045045045
42 1  .9166666666666666
43 0  .8545454545454545
44 1  .8560606060606061
45 1  .8333333333333334
46 0  .7739130434782608
47 0  .5636363636363636
48 0  .7241379310344828
49 0  .6131386861313869
50 0  .6124031007751938
51 1  .6956521739130435
52 1  .5606060606060606
53 0  .8032786885245902
54 0   .782258064516129
55 1  .9448818897637795
56 0  .6538461538461539
57 1           .6953125
58 0  .8981481481481481
59 0  .6271186440677966
end

------------------ copy up to and including the previous line ------------------


Carlo Lazzaro

Join Date: Apr 2014
Posts: 17708

23 Dec 2021, 04:49

Asish:
you may want to consider:

Code:

. logistic hypo_before_del shock_index, vce(cluster serialno)

Logistic regression                                     Number of obs =     30
                                                        Wald chi2(1)  =   0.32
                                                        Prob > chi2   = 0.5704
Log pseudolikelihood = -19.529179                       Pseudo R2     = 0.0094

                                 (Std. err. adjusted for 30 clusters in serialno)
---------------------------------------------------------------------------------
                |               Robust
hypo_before_del | Odds ratio   std. err.      z    P>|z|     [95% conf. interval]
----------------+----------------------------------------------------------------
    shock_index |   5.482709   16.44056     0.57   0.570     .0153662    1956.251
          _cons |   .1531845   .3646242    -0.79   0.431     .0014425    16.26767
---------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

. estat classification

Logistic model for hypo_before_del

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |         0             0  |          0
     -     |        11            19  |         30
-----------+--------------------------+-----------
   Total   |        11            19  |         30

Classified + if predicted Pr(D) >= .5
True D defined as hypo_before_del != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)    0.00%
Specificity                     Pr( -|~D)  100.00%
Positive predictive value       Pr( D| +)       .%
Negative predictive value       Pr(~D| -)   63.33%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)    0.00%
False - rate for true D         Pr( -| D)  100.00%
False + rate for classified +   Pr(~D| +)       .%
False - rate for classified -   Pr( D| -)   36.67%
--------------------------------------------------
Correctly classified                        63.33%
--------------------------------------------------

.

It's trivial to notice that a single predictor makes your regression misspecified.

Kind regards,
Carlo
(Stata 19.0)


Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#3

23 Dec 2021, 09:57

It is not meaningful to speak of sensitivity, specificity, NPV or PPV in the context of a continuous predictor. Those parameters are only meaningful once you pick a cutoff value for the continuous predictor: then you can define the operating characteristics for the dichotomous predictor corresponding to greater than vs less than the cutoff. The -estat classification- command recommended in #2 will, by default, use a cutoff of 0.5 predicted probability. That is seldom useful in real life. -estat classification- does have a -cutoff()- option that allows you to specify the threshold of predicted probability that you want to use. In your context it probably makes sense to first run -lroc- (after the logistic regression) to see a graph of sensitivity vs (1 minus) specificity: this will enable you to identify a range of values for the cutoff that produce reasonable values of sensitivity and specificity. Then you can run -estat classification- a few times with selected cutoffs to get quantitative estimates of those characteristics of the test operated at those cutoffs.

As for the Youden statistic, it is simply the sum of sensitivity and specificity, minus 1. It is widely used in the medical literature; it is also meaningless and useless in nearly all realistic scenarios.
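
A minimal sketch of the workflow just described, using the data posted in #1 (without serialno). The three probability cutoffs in the loop (0.30, 0.35, 0.40) are arbitrary illustrations, not recommendations, and the scalar names sens, spec and J are made up for this example. r(P_p1) and r(P_n0) are the sensitivity and specificity (in percent) left behind by -estat classification-.

```stata
clear
input byte hypo_before_del double shock_index
1  .9117647058823529
0  .6891891891891891
0 1.0869565217391304
0  .7969924812030075
0               .712
1  .9322033898305084
0   .853448275862069
0  .7777777777777778
1  .8333333333333334
0  .6830985915492958
1  .5895522388059702
0  1.045045045045045
1  .9166666666666666
0  .8545454545454545
1  .8560606060606061
1  .8333333333333334
0  .7739130434782608
0  .5636363636363636
0  .7241379310344828
0  .6131386861313869
0  .6124031007751938
1  .6956521739130435
1  .5606060606060606
0  .8032786885245902
0   .782258064516129
1  .9448818897637795
0  .6538461538461539
1           .6953125
0  .8981481481481481
0  .6271186440677966
end

* Fit the model on the continuous predictor and report the area under the ROC curve
logistic hypo_before_del shock_index
lroc, nograph

* Operating characteristics at a few illustrative predicted-probability cutoffs
foreach c of numlist 0.30 0.35 0.40 {
    quietly estat classification, cutoff(`c')
    scalar sens = r(P_p1)/100
    scalar spec = r(P_n0)/100
    scalar J = sens + spec - 1
    display "cutoff " %4.2f `c' "  sens " %5.3f sens "  spec " %5.3f spec "  Youden J " %6.3f J
}
```

Dropping the nograph option draws the ROC curve itself, which is the easier way to spot a sensible range of cutoffs before tabulating them.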
Asish Subedi

Join Date: Jul 2021

Posts: 15
#4

25 Dec 2021, 06:38

@Carlo Lazzaro @Clyde Schechter, according to your suggestions, I did the following, found .80 as the cutoff, and did the subsequent analysis. Please correct me if there are any mistakes.

regards,
Asish
Asish Subedi

Join Date: Jul 2021

Posts: 15
#5

25 Dec 2021, 07:47

Carlo Lazzaro Clyde Schechter, I have added the output in a better way in the post below.

Last edited by Asish Subedi; 25 Dec 2021, 08:11.

Asish Subedi

Join Date: Jul 2021
Posts: 15

25 Dec 2021, 08:07

@Carlo Lazzaro @Clyde Schechter, according to your suggestions, I did the following, found .80 as the cutoff, and did the subsequent analysis. Please correct me if there are any mistakes.

. dataex hypo_before_del shock_index

----------------------- copy starting from the next line -----------------------

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte hypo_before_del double shock_index
0                 .8
0            .640625
1 .49264705882352944
1            .765625
1  .8455284552845529
0  .8016528925619835
1  .8548387096774194
0  .9333333333333333
1  1.016260162601626
0  .9158878504672897
1  .8688524590163934
1               .825
0  .7465753424657534
0  .9619047619047619
1  .7874015748031497
1  .6691729323308271
0  .8761061946902655
1  .9370629370629371
0  .6521739130434783
1  .8793103448275862
1               .824
1  .8916666666666667
0           .6015625
0  .7301587301587301
0  .5857142857142857
0  .7166666666666667
1  .6611570247933884
0   .849624060150376
0                  1
.                  .
.                  .
end

------------------ copy up to and including the previous line ------------------

Code:

   roctab hypo_before_del shock_index, detail

Detailed report of sensitivity and specificity
------------------------------------------------------------------------------
                                           Correctly
Cutpoint      Sensitivity   Specificity   Classified          LR+          LR-
------------------------------------------------------------------------------
( >= .4926.. )    100.00%         0.00%       48.28%       1.0000
( >= .5857.. )     92.86%         0.00%       44.83%       0.9286
( >= .6015.. )     92.86%         6.67%       48.28%       0.9949       1.0714
( >= .640625 )     92.86%        13.33%       51.72%       1.0714       0.5357
( >= .6521.. )     92.86%        20.00%       55.17%       1.1607       0.3571
( >= .661157 )     92.86%        26.67%       58.62%       1.2662       0.2679
( >= .6691.. )     85.71%        26.67%       55.17%       1.1688       0.5357
( >= .7166.. )     78.57%        26.67%       51.72%       1.0714       0.8036
( >= .7301.. )     78.57%        33.33%       55.17%       1.1786       0.6429
( >= .7465.. )     78.57%        40.00%       58.62%       1.3095       0.5357
( >= .765625 )     78.57%        46.67%       62.07%       1.4732       0.4592
( >= .7874.. )     71.43%        46.67%       58.62%       1.3393       0.6122
( >= .8 )          64.29%        46.67%       55.17%       1.2054       0.7653
( >= .8016.. )     64.29%        53.33%       58.62%       1.3776       0.6696
( >= .824 )        64.29%        60.00%       62.07%       1.6071       0.5952
( >= .825 )        57.14%        60.00%       58.62%       1.4286       0.7143
( >= .8455.. )     50.00%        60.00%       55.17%       1.2500       0.8333
( >= .8496.. )     42.86%        60.00%       51.72%       1.0714       0.9524
( >= .8548.. )     42.86%        66.67%       55.17%       1.2857       0.8571
( >= .8688.. )     35.71%        66.67%       51.72%       1.0714       0.9643
( >= .8761.. )     28.57%        66.67%       48.28%       0.8571       1.0714
( >= .8793.. )     28.57%        73.33%       51.72%       1.0714       0.9740
( >= .8916.. )     21.43%        73.33%       48.28%       0.8036       1.0714
( >= .9158.. )     14.29%        73.33%       44.83%       0.5357       1.1688
( >= .9333.. )     14.29%        80.00%       48.28%       0.7143       1.0714
( >= .9370.. )     14.29%        86.67%       51.72%       1.0714       0.9890
( >= .9619.. )      7.14%        86.67%       48.28%       0.5357       1.0714
( >= 1 )            7.14%        93.33%       51.72%       1.0714       0.9949
( >= 1.01626 )      7.14%       100.00%       55.17%                    0.9286
( >  1.01626 )      0.00%       100.00%       51.72%                    1.0000
------------------------------------------------------------------------------

From these observations, I found .80 as the cutoff
Next, I categorized them

gen shockindex=1

replace shockindex=0 if shock_index<=0.80

then I get the following results

Code:

 roctab hypo_before_del shockindex, detail

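Detailed report of sensitivity and specificity
------------------------------------------------------------------------------
                                           Correctly
Cutpoint      Sensitivity   Specificity   Classified          LR+          LR-
------------------------------------------------------------------------------
( >= 0 )          100.00%         0.00%       48.28%       1.0000
( >= 1 )           64.29%        53.33%       58.62%       1.3776       0.6696
( >  1 )            0.00%       100.00%       51.72%                    1.0000
------------------------------------------------------------------------------


                      ROC                    -Asymptotic Normal--
           Obs       Area     Std. Err.      [95% Conf. Interval]
     ------------------------------------------------------------
            29     0.5881       0.0941        0.40361     0.77258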

Next,

Code:

. logistic hypo_before_del shockindex

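Logistic regression                             Number of obs     =         29
                                                LR chi2(1)        =       0.91
                                                Prob > chi2       =     0.3389
Log likelihood = -19.626647                     Pseudo R2         =     0.0228

---------------------------------------------------------------------------------
hypo_before_del | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
     shockindex |   2.057143   1.565279     0.95   0.343     .4630048    9.139941
          _cons |       .625   .3563048    -0.82   0.410     .2044657    1.910467
---------------------------------------------------------------------------------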

Code:

 estat classification

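Logistic model for hypo_before_del

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |         9             7  |         16
     -     |         5             8  |         13
-----------+--------------------------+-----------
   Total   |        14            15  |         29

Classified + if predicted Pr(D) >= .5
True D defined as hypo_before_del != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   64.29%
Specificity                     Pr( -|~D)   53.33%
Positive predictive value       Pr( D| +)   56.25%
Negative predictive value       Pr(~D| -)   61.54%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)   46.67%
False - rate for true D         Pr( -| D)   35.71%
False + rate for classified +   Pr(~D| +)   43.75%
False - rate for classified -   Pr( D| -)   38.46%
--------------------------------------------------
Correctly classified                        58.62%
--------------------------------------------------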

As per your feedback, I used a cutoff of 0.80 as the predicted probability.

Code:

 estat classification, cutoff(.80)

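Logistic model for hypo_before_del

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |         0             0  |          0
     -     |        14            15  |         29
-----------+--------------------------+-----------
   Total   |        14            15  |         29

Classified + if predicted Pr(D) >= .8
True D defined as hypo_before_del != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)    0.00%
Specificity                     Pr( -|~D)  100.00%
Positive predictive value       Pr( D| +)       .%
Negative predictive value       Pr(~D| -)   51.72%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)    0.00%
False - rate for true D         Pr( -| D)  100.00%
False + rate for classified +   Pr(~D| +)       .%
False - rate for classified -   Pr( D| -)   48.28%
--------------------------------------------------
Correctly classified                        51.72%
--------------------------------------------------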

Now, which of the two -estat classification- results is the correct one?

Code:

 lroc

Logistic model for hypo_before_del

number of observations = 29
area under ROC curve = 0.5881

regards,
Asish Subedi

Last edited by Asish Subedi; 25 Dec 2021, 08:14.


Clyde Schechter

Join Date: Apr 2014
Posts: 30100

26 Dec 2021, 11:41

You are getting contradictory results because you are confusing two different cutoffs. In your raw data, analyzed with -roctab-, the only cutoff under consideration is the value of shock_index, which you chose to set at 0.8. Fine.

Then you create a dichotomous variable--but you did it incorrectly. You have missing values for shock_index, and the way you created shockindex, it gets set to 1 when shock_index is missing. If you did it correctly, you would see that the dichotomous variable exactly reproduces the results you get from selecting the 0.8 cutoff in the first -roctab- output:

Code:

. roctab hypo_before_del shock_index, detail

Detailed report of sensitivity and specificity
------------------------------------------------------------------------------
                                           Correctly
Cutpoint      Sensitivity   Specificity   classified          LR+          LR-
------------------------------------------------------------------------------
( >= .4926.. )    100.00%         0.00%       48.28%       1.0000    
( >= .5857.. )     92.86%         0.00%       44.83%       0.9286    
( >= .6015.. )     92.86%         6.67%       48.28%       0.9949       1.0714
( >= .640625 )     92.86%        13.33%       51.72%       1.0714       0.5357
( >= .6521.. )     92.86%        20.00%       55.17%       1.1607       0.3571
( >= .661157 )     92.86%        26.67%       58.62%       1.2662       0.2679
( >= .6691.. )     85.71%        26.67%       55.17%       1.1688       0.5357
( >= .7166.. )     78.57%        26.67%       51.72%       1.0714       0.8036
( >= .7301.. )     78.57%        33.33%       55.17%       1.1786       0.6429
( >= .7465.. )     78.57%        40.00%       58.62%       1.3095       0.5357
( >= .765625 )     78.57%        46.67%       62.07%       1.4732       0.4592
( >= .7874.. )     71.43%        46.67%       58.62%       1.3393       0.6122
( >= .8 )          64.29%        46.67%       55.17%       1.2054       0.7653
( >= .8016.. )     64.29%        53.33%       58.62%       1.3776       0.6696
( >= .824 )        64.29%        60.00%       62.07%       1.6071       0.5952
( >= .825 )        57.14%        60.00%       58.62%       1.4286       0.7143
( >= .8455.. )     50.00%        60.00%       55.17%       1.2500       0.8333
( >= .8496.. )     42.86%        60.00%       51.72%       1.0714       0.9524
( >= .8548.. )     42.86%        66.67%       55.17%       1.2857       0.8571
( >= .8688.. )     35.71%        66.67%       51.72%       1.0714       0.9643
( >= .8761.. )     28.57%        66.67%       48.28%       0.8571       1.0714
( >= .8793.. )     28.57%        73.33%       51.72%       1.0714       0.9740
( >= .8916.. )     21.43%        73.33%       48.28%       0.8036       1.0714
( >= .9158.. )     14.29%        73.33%       44.83%       0.5357       1.1688
( >= .9333.. )     14.29%        80.00%       48.28%       0.7143       1.0714
( >= .9370.. )     14.29%        86.67%       51.72%       1.0714       0.9890
( >= .9619.. )      7.14%        86.67%       48.28%       0.5357       1.0714
( >= 1 )            7.14%        93.33%       51.72%       1.0714       0.9949
( >= 1.01626 )      7.14%       100.00%       55.17%                    0.9286
( >  1.01626 )      0.00%       100.00%       51.72%                    1.0000
------------------------------------------------------------------------------


                      ROC                     Asymptotic normal  
           Obs       area     Std. err.      [95% conf. interval]
     ------------------------------------------------------------
            29     0.5667       0.1121        0.34691     0.78642

.
. gen byte shockindex = (shock_index >= 0.8) if !missing(shock_index)
(2 missing values generated)

. roctab hypo_before_del shockindex, detail

Detailed report of sensitivity and specificity
------------------------------------------------------------------------------
                                           Correctly
Cutpoint      Sensitivity   Specificity   classified          LR+          LR-
------------------------------------------------------------------------------
( >= 0 )          100.00%         0.00%       48.28%       1.0000    
( >= 1 )           64.29%        46.67%       55.17%       1.2054       0.7653
( >  1 )            0.00%       100.00%       51.72%                    1.0000
------------------------------------------------------------------------------


                      ROC                     Asymptotic normal  
           Obs       area     Std. err.      [95% conf. interval]
     ------------------------------------------------------------
            29     0.5548       0.0941        0.37028     0.73925

Then you confuse things further by going to logistic regression and -estat classification-. You use those correctly, but what you misunderstand is that the -cutoff()- used by -estat classification- is not the value of shock_index. Rather, it is the probability of hypo_before_del predicted by the logistic regression model. Which is a completely different matter.
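
A small sketch of why that happens, using only the odds ratio and baseline odds printed in the -logistic- output in #6. The scalar names (or_shock, base_odds, p_low, p_high) are made up for this illustration. With a dichotomous predictor the model can produce only two predicted probabilities, one per group.

```stata
* Odds ratio and baseline odds copied from the -logistic- output in #6
scalar or_shock  = 2.057143
scalar base_odds = .625

* Predicted probability of hypotension when shockindex == 0
scalar p_low = base_odds/(1 + base_odds)

* Predicted probability of hypotension when shockindex == 1
scalar p_high = base_odds*or_shock/(1 + base_odds*or_shock)

display "Pr(D | shockindex = 0) = " %6.4f p_low
display "Pr(D | shockindex = 1) = " %6.4f p_high
```

The two probabilities come out at roughly 0.38 and 0.56. A probability cutoff of 0.5 falls between them, so it reproduces the split on shockindex; a probability cutoff of 0.8 lies above both, so nobody is classified positive and sensitivity drops to 0%.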

The correct values of sensitivity and specificity associated with using a cutoff of 0.8 value of shock_index are the ones shown above, generated from -roctab-. The values you are getting from -estat classification, cutoff(0.8)- after logistic regression are something else altogether as they are based on a different cutoff.
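
The missing-value problem mentioned earlier in this post can be reproduced on a toy example. The four shock_index values below are purely illustrative, and the variable names bad and good are hypothetical. Stata treats a missing value as larger than any number, so the condition shock_index <= 0.80 is false for missing observations and they keep the initial value of 1.

```stata
clear
input double shock_index
.75
.8
.85
.
end

* Approach used in #6: start at 1, then set to 0 at or below the cutoff
gen byte bad = 1
replace bad = 0 if shock_index <= 0.80

* Safer approach: positive at or above the cutoff, missing stays missing
gen byte good = (shock_index >= 0.8) if !missing(shock_index)

list, noobs
```

Note also that the two rules disagree for a value exactly equal to 0.8: the first codes it 0 and the second codes it 1. That may be why the dichotomous variable in #6 reproduced the ( >= .8016.. ) row of the -roctab- table rather than the ( >= .8 ) row.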


Asish Subedi

Join Date: Jul 2021
Posts: 15

27 Dec 2021, 23:57

@Clyde Schechter, Thank you for rectifying my mistakes. I did three different techniques to find out the optimal cutoff point (ROC, Liu method, and bootstrap). All techniques showed 0.8016 as an optimal cutoff point with sensitivity and specificity as 0.67 and 0.60 respectively. Also, after converting the shock index variable (based on cutoff) into dichotomous I get the same sensitivity and specificity.
Then, I did logistic regression, estat classification, and lroc (based on dichotomous shock index variable) and found the following:
From estat classification (with predicted probability of 0.5): sensitivity 0.67, specificity 0.60, PPV 0.62, NPV 0.64
Should I stick with these values?

I came across this video which focused on Optimal Cut-Points for Continuous Predictors to Discriminate Disease Outcomes.
https://www.youtube.com/watch?v=UnlD0VT1dPQ
From the video, I learned that there are several methods: odds ratio, ROC, Gini index, kappa statistics, chi-square statistics, Youden index, misclassification.

My first question is which method for selecting the optimal cutoff point should I report in my manuscript?

From the lroc command, I get an area under the ROC curve of 0.63.

Next, with the command below I get a probability cutoff.

lsens , gensens(sens_shockindex) genspec(spec_shockindex)

How do I interpret this graph?
Regards,
Asish

----------------------- copy starting from the next line -----------------------

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte hypo_before_del double shock_index byte shockindex
1  1.016260162601626 1
1  .6691729323308271 0
0  .7465753424657534 0
0                  1 1
1  .8455284552845529 1
0  .9158878504672897 1
1 .49264705882352944 0
1  .9370629370629371 1
0                 .8 0
.                  . .
1               .825 1
0   .849624060150376 1
0  .7166666666666667 0
1  .8548387096774194 1
0  .5857142857142857 0
1  .8688524590163934 1
0  .8016528925619835 0
0  .6521739130434783 0
0  .9619047619047619 1
1  .6611570247933884 0
1  .9117647058823529 1
1  .8916666666666667 1
0           .6015625 0
1  .8793103448275862 1
0            .640625 0
.                  . .
0  .7301587301587301 0
1  .7874015748031497 0
0  .8761061946902655 1
0  .9333333333333333 1
1            .765625 0
1               .824 1
end

------------------ copy up to and including the previous line ------------------

Listed 32 out of 32 observations

Code:

  .  roctab hypo_before_del shock_index, detail

Detailed report of sensitivity and specificity
------------------------------------------------------------------------------
                                           Correctly
Cutpoint      Sensitivity   Specificity   Classified          LR+          LR-
------------------------------------------------------------------------------
( >= .4926.. )    100.00%         0.00%       50.00%       1.0000     
( >= .5857.. )     93.33%         0.00%       46.67%       0.9333     
( >= .6015.. )     93.33%         6.67%       50.00%       1.0000       1.0000
( >= .640625 )     93.33%        13.33%       53.33%       1.0769       0.5000
( >= .6521.. )     93.33%        20.00%       56.67%       1.1667       0.3333
( >= .661157 )     93.33%        26.67%       60.00%       1.2727       0.2500
( >= .6691.. )     86.67%        26.67%       56.67%       1.1818       0.5000
( >= .7166.. )     80.00%        26.67%       53.33%       1.0909       0.7500
( >= .7301.. )     80.00%        33.33%       56.67%       1.2000       0.6000
( >= .7465.. )     80.00%        40.00%       60.00%       1.3333       0.5000
( >= .765625 )     80.00%        46.67%       63.33%       1.5000       0.4286
( >= .7874.. )     73.33%        46.67%       60.00%       1.3750       0.5714
( >= .8 )          66.67%        46.67%       56.67%       1.2500       0.7143
( >= .8016.. )     66.67%        53.33%       60.00%       1.4286       0.6250
( >= .824 )        66.67%        60.00%       63.33%       1.6667       0.5556
( >= .825 )        60.00%        60.00%       60.00%       1.5000       0.6667
( >= .8455.. )     53.33%        60.00%       56.67%       1.3333       0.7778
( >= .8496.. )     46.67%        60.00%       53.33%       1.1667       0.8889
( >= .8548.. )     46.67%        66.67%       56.67%       1.4000       0.8000
( >= .8688.. )     40.00%        66.67%       53.33%       1.2000       0.9000
( >= .8761.. )     33.33%        66.67%       50.00%       1.0000       1.0000
( >= .8793.. )     33.33%        73.33%       53.33%       1.2500       0.9091
( >= .8916.. )     26.67%        73.33%       50.00%       1.0000       1.0000
( >= .9117.. )     20.00%        73.33%       46.67%       0.7500       1.0909
( >= .9158.. )     13.33%        73.33%       43.33%       0.5000       1.1818
( >= .9333.. )     13.33%        80.00%       46.67%       0.6667       1.0833
( >= .9370.. )     13.33%        86.67%       50.00%       1.0000       1.0000
( >= .9619.. )      6.67%        86.67%       46.67%       0.5000       1.0769
( >= 1 )            6.67%        93.33%       50.00%       1.0000       1.0000
( >= 1.01626 )      6.67%       100.00%       53.33%                    0.9333
( >  1.01626 )      0.00%       100.00%       50.00%                    1.0000
------------------------------------------------------------------------------


                      ROC                    -Asymptotic Normal--
           Obs       Area     Std. Err.      [95% Conf. Interval]
     ------------------------------------------------------------
            30     0.5778       0.1105        0.36125     0.79430

Code:

     . cutpt hypo_before_del shock_index, noadjust

Empirical cutpoint estimation
Method:                                Liu
Reference variable:                    hypo_before_del (0=neg, 1=pos)
Classification variable:               shock_index
Empirical optimal cutpoint:            .80165289
Sensitivity at cutpoint:               0.67
Specificity at cutpoint:               0.60
Area under ROC curve at cutpoint:      0.63

Code:

   . bootstrap e(cutpoint), rep(100): cutpt hypo_before_del shock_index, noadjust
(running cutpt on estimation sample)

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................x................x..............    50
.......x...x............x.........................   100

Bootstrap results                               Number of obs     =         30
                                                Replications      =         95

      command:  cutpt hypo_before_del shock_index, noadjust
        _bs_1:  e(cutpoint)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   .8016529   .0457482    17.52   0.000     .7119881    .8913177
------------------------------------------------------------------------------
Note: One or more parameters could not be estimated in 5 bootstrap replicates;
      standard-error estimates include only complete replications.

Code:

   . roctab hypo_before_del shockindex, detail

Detailed report of sensitivity and specificity
------------------------------------------------------------------------------
                                           Correctly
Cutpoint      Sensitivity   Specificity   Classified          LR+          LR-
------------------------------------------------------------------------------
( >= 0 )          100.00%         0.00%       50.00%       1.0000     
( >= 1 )           66.67%        60.00%       63.33%       1.6667       0.5556
( >  1 )            0.00%       100.00%       50.00%                    1.0000
------------------------------------------------------------------------------


                      ROC                    -Asymptotic Normal--
           Obs       Area     Std. Err.      [95% Conf. Interval]
     ------------------------------------------------------------
            30     0.6333       0.0909        0.45527     0.81140

Code:

 . logistic hypo_before_del shockindex, vce(cluster serialno)

Logistic regression                             Number of obs     =         30
                                                Wald chi2(1)      =       2.02
                                                Prob > chi2       =     0.1553
Log pseudolikelihood = -19.709604               Pseudo R2         =     0.0522

                                 (Std. Err. adjusted for 30 clusters in serialno)
---------------------------------------------------------------------------------
                |               Robust
hypo_before_del | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
     shockindex |          3   2.319334     1.42   0.155     .6592463    13.65195
          _cons |   .5555556   .3151715    -1.04   0.300       .18274    1.688968
---------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

Code:

  .  estat classification

Logistic model for hypo_before_del

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |        10             6  |         16
     -     |         5             9  |         14
-----------+--------------------------+-----------
   Total   |        15            15  |         30

Classified + if predicted Pr(D) >= .5
True D defined as hypo_before_del != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   66.67%
Specificity                     Pr( -|~D)   60.00%
Positive predictive value       Pr( D| +)   62.50%
Negative predictive value       Pr(~D| -)   64.29%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)   40.00%
False - rate for true D         Pr( -| D)   33.33%
False + rate for classified +   Pr(~D| +)   37.50%
False - rate for classified -   Pr( D| -)   35.71%
--------------------------------------------------
Correctly classified                        63.33%
--------------------------------------------------

Code:

 
. lroc

Logistic model for hypo_before_del

number of observations =       30
area under ROC curve   =   0.6333

Code:

   lsens , gensens(sens_shockindex) genspec(spec_shockindex)

Attached file: Graph.gph


Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#9

28 Dec 2021, 09:31

Then, I did logistic regression, estat classification, and lroc (based on dichotomous shock index variable) and found the following:
From estat classification (with a predicted probability cutoff of 0.5): sensitivity 0.67, specificity 0.60, PPV 0.62, NPV 0.64
Should I stick with these values?

Yes.

From the lroc command, I get an area under the ROC curve of 0.63.

So, this is a rather mediocre result. The discrimination is a little better than an uninformed guess, but not much.

My first question is which method for selecting the optimal cutoff point should I report in my manuscript.

As you note, there are many approaches. And you did not even mention that one that I favor in most circumstances: decision theory. There are varying reasons for preferring one way or another in different circumstances. No one is best. You should report whichever one you finally settled on. If several of them led you to the same point, you can mention those as well.

How do I interpret this graph?

That kind of graph is not useful for a dichotomous predictor. Although it purports to show results for 4 probability cutoffs, in fact it is only showing one, the one near the center of the graph. The others are meaningless with a dichotomous predictor. This kind of graph is really intended for use directly with a continuous predictor. And it is just a different way of showing the information that is contained in the ROC graph. If you are going to show a graph in your report, the ROC graph from the continuous version of your index would be a better choice.
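
A minimal sketch of the continuous-predictor workflow suggested here, assuming the -dataex- example from the first post is in memory. The variable names cut_p, sens_c, spec_c and youden_c are made up for this illustration, and picking the cutoff that maximizes the Youden index is only one of the several approaches mentioned above.

```stata
* Fit the model on the continuous index rather than the dichotomized one
logistic hypo_before_del shock_index

* Area under the ROC curve (drop -nograph- to draw the ROC graph)
lroc, nograph

* Save sensitivity and specificity at every predicted-probability cutoff
lsens, nograph genprob(cut_p) gensens(sens_c) genspec(spec_c)

* Youden index at each cutoff
generate youden_c = sens_c + spec_c - 1

* Show the cutoffs with the largest Youden index first
gsort -youden_c
list cut_p sens_c spec_c youden_c in 1/5
```

Note that cut_p is on the predicted-probability scale, not in shock index units, so it would still need to be translated back to a shock index value before being reported as a clinical cutoff.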
Comment
Bruce Weaver

Join Date: May 2014

Posts: 1133
#10

28 Dec 2021, 10:12

I have not seen this done much (if at all) in medical & health related research, but I think it is useful to report the Gini coefficient in addition to the AUC, as it gives the proportion of area under the curve above the diagonal. For Asish's data:

Code:

. display "GINI = 2*AUC-1 = " 2*0.6333-1
GINI = 2*AUC-1 = .2666

I once heard someone describe the Gini coefficient as a chance-corrected AUC, and thought that was a pretty good way to think of it.

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 19.5 (Windows)
Comment
Asish Subedi

Join Date: Jul 2021

Posts: 15
#11

30 Dec 2021, 03:56

Clyde Schechter, Thank you for enlightening me with the use of decision curve analysis (DCA). As you are aware, I wanted to see whether shock index (dichotomized) is associated with hypotension.
From the DCA, I observed the range of threshold probabilities between 33% and 62%. As a result, with an approximate threshold probability of 50%, the probability of hypotension was greater than 50% with the help of the shock index. I hope the interpretation is correct.

Code:

dca hypo_before_del shockindex

Second, as suggested I have presented the probability cutoff using the continuous predictor (shock_index) and found it to be approx. 50%. Is the interpretation the same as we did for DCA?

Code:

lsens , gensens(sens_shock_index) genspec(spec_shock_index)

@ Bruce Weaver, Thank you for your inputs regarding Gini index.

regards,
Asish Subedi

Attached Files

DCA graph.gph (8.5 KB, 1 view)

probability cutoff.gph (7.0 KB, 1 view)
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#12

30 Dec 2021, 12:13

Well, the -dca- program is nice, but it has some limitations, and it also requires some care in its use and interpretation. It implicitly assumes that the disutility associated with treating a false positive is the same as the disutility of not treating a false negative. That is not usually the case in reality. On the plus side, it does allow the user to specify a harm associated with the test itself. Whether your shock_index variable can be said to be cost-free and risk-free I do not know, as you haven't really said anything about it. But if it requires some level of risk or cost (say, for example, it requires something other than reviewing existing known attributes of the patient) then some amount of harm should be posited. Also, -dca- allows you to specify the prevalence in the target population for this test. As you did not specify that option, it defaults to assuming that the population prevalence is the same as the prevalence in your data sample. Whether that is appropriate depends on whether your sample is representative of the population. Again, as you have said nothing about how your sample was accrued, I can't comment more specifically.

Finally, I don't think that dichotomous predictors are suitable for use with -dca-. The whole concept behind -dca- is to identify an appropriate cutoff for defining (and acting on) a positive test result from a continuous predictor. A dichotomous predictor has no range of values for offering a choice of threshold.

Decision analysis (as opposed to the limited implementation of it in the -dca- software), more generally, overcomes all of these limitations--but it is more complicated to carry out and does not easily reduce itself to a simple software program as many different factors in different contexts may need to be incorporated. I think for your purposes, -dca- is a decent approximation, although, since I don't know what the consequences of

Ultimately, the way to read the graphs is to look at the range of threshold probabilities where the net benefit for the index is higher than both the net benefit of Treat All and the net benefit of Treat None. Using a predicted probability in that range as the threshold to define a positive test result will, under those (usually unrealistic) assumptions about disutility, produce greater net benefit than not using the test and adopting a policy of either treating everybody or treating nobody.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#13

30 Dec 2021, 14:48

Having not used -dca- in a while, I decided to re-read the Vickers and Elkin article in Medical Decision Making on which it is based. I realize now that some of what I said in #12 was incorrect. It does not implicitly assume that the disutility of a false negative test is the same as the disutility of a false positive. Rather, it assumes that the choice of a particular threshold probability of disease as a trigger for treatment implicitly determines that tradeoff, through the equation (Net Benefit of Treatment of a True Case)/(Net Harm of Unnecessary Treatment) = (1-p)/p, where p is the threshold probability, and they provide the algebraic argument supporting that assumption.

Last edited by Clyde Schechter; 30 Dec 2021, 14:50.
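
To make the (1-p)/p tradeoff concrete, here is a rough sketch that computes net benefit by hand for the dichotomized index, using the usual decision-curve definition NB = TP/n - (FP/n)*p/(1-p). It is an illustration under stated assumptions, not output from -dca-: the counts are taken from the classification table earlier in the thread, the macro names are invented here, and the thresholds 0.40, 0.50 and 0.60 are picked because they lie between the two fitted probabilities implied by that table (5/14, about 0.357, and 10/16 = 0.625), so the same patients are classified positive at each of them.

```stata
* Counts from the classification table for the dichotomized index
local n    = 30
local tp   = 10
local fp   = 6
local prev = 15/30

foreach p of numlist 0.40 0.50 0.60 {
    * odds of the threshold probability: the weight applied to false positives
    local w = `p'/(1 - `p')

    * net benefit of acting on the index versus treating everybody
    local nb_index = `tp'/`n' - (`fp'/`n')*`w'
    local nb_all   = `prev' - (1 - `prev')*`w'

    display "p = " %4.2f `p' "   implied benefit/harm ratio (1-p)/p = " %4.2f 1/`w'
    display "   NB(index) = " %7.4f `nb_index' "   NB(treat all) = " %7.4f `nb_all' "   NB(treat none) = 0"
}
```

At p = 0.50 the weight is 1, so the net benefit of the index is 10/30 - 6/30, about 0.133, against 0 for both Treat All and Treat None.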
Comment
Asish Subedi

Join Date: Jul 2021

Posts: 15
#14

01 Jan 2022, 02:20

@Clyde Schechter, shock index is an observed calculated variable; it is cost-free and not an invasive procedure like taking a biopsy. I guess DCA may not be required in my case.
Comment
Itai Magodoro

Join Date: Oct 2020

Posts: 26
#15

08 Sep 2022, 15:10

Clyde Schechter, Asish Subedi, Carlo Lazzaro, Bruce Weaver: Thanks for this thread. Might you have any suggestions on how to do a similar analysis (i.e., estat classification with pre-determined cut-offs) but incorporating survey weights? That is, estimating the correct classification rate for various cutoff values while incorporating pweights.

I have used

Code:

rocreg y x [pweight=wt], probit ml

Code:

senspec y x [pweigh=wt], se(sens) sp(spec)
list sens spec if x==x1
list sens spec if x==x2

to calculate overall AUC, and sensitivity and specificity at set cutoff values (x= x1, x2, ...xn). I would also like to calculate the correct classification rate for each of the cutoff values (x1, x2..xn), which I (think I) could do with estat classification if it accepted pweights

Code:

probit y x [pweight=wt]
estat classification, cut(x1)

Thanks in advance for any thoughts you might share.
Itai
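
One possible way to get a weighted correct classification rate without estat classification is to build the classification indicator by hand and take its weighted mean. This is only a sketch under assumptions: y, x and wt are the placeholder names from the post above, the cutoff 0.8 is made up, higher values of x are assumed to indicate the outcome, and the cutoff is applied directly to x rather than to a predicted probability.

```stata
* y (0/1 outcome), x (continuous marker) and wt (sampling weight) are
* placeholder names; 0.8 is a hypothetical cutoff on the scale of x
local cut = 0.8

* test result at this cutoff, and whether it agrees with the outcome
generate byte test_pos = (x >= `cut') if !missing(x)
generate byte correct  = (test_pos == y) if !missing(test_pos, y)

* weighted proportion correctly classified at this cutoff
mean correct [pweight = wt]

* weighted proportion testing positive within each outcome group:
* for y == 1 this is sensitivity, for y == 0 it is 1 - specificity
mean test_pos [pweight = wt], over(y)
```

Repeating the block with a different value in the local macro cut (after dropping test_pos and correct) would give the rate for each pre-determined cutoff.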
Comment
