Dear Statalist members,
As part of developing a nomogram, I fitted a logistic regression model and evaluated its diagnostic performance in predicting muscle invasiveness at final pathology.
This is the code I used, together with its output:
Code:
//EAU tumor-related model (based only on tumor features)
.
. logistic muscle_invasive grade_bio clinicalt_high size_tumor_high_EAU multifocal variant_histology
note: variant_histology != 0 predicts success perfectly;
      variant_histology omitted and 5 obs not used.

Logistic regression                                     Number of obs =    419
                                                        LR chi2(4)    =  70.77
                                                        Prob > chi2   = 0.0000
Log likelihood = -254.41337                             Pseudo R2     = 0.1221

-------------------------------------------------------------------------------------
    muscle_invasive | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
--------------------+----------------------------------------------------------------
          grade_bio |   4.532146   1.109442     6.17   0.000     2.805013    7.322727
     clinicalt_high |   3.461076   .8808343     4.88   0.000     2.101758    5.699537
size_tumor_high_EAU |   1.054306   .2442513     0.23   0.819     .6695275    1.660216
         multifocal |    .790829   .2003062    -0.93   0.354     .4813764    1.299213
  variant_histology |          1  (omitted)
              _cons |   .2788742   .0778792    -4.57   0.000     .1613241    .4820781
-------------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

.
. lroc

Logistic model for muscle_invasive

Number of observations =      419
Area under ROC curve   =   0.7150

.
. looclass muscle_invasive grade_bio clinicalt_high size_tumor_high_EAU multifocal variant_histology, model(logit) fig

Iterating across (424) observations
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
..................................................   150
..................................................   200
..................................................   250
..................................................   300
..................................................   350
..................................................   400
........................

Classification Table for Full Data:
              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |       194           106  |        300
     -     |        32            92  |        124
-----------+--------------------------+-----------
   Total   |       226           198  |        424

Classification Table for Test Data:
              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |       187           121  |        308
     -     |        39            77  |        116
-----------+--------------------------+-----------
   Total   |       226           198  |        424

Classified + if predicted Pr(D) >= .5
True D defined as  != 0
                                                    Full       Test
----------------------------------------------------------------
Sensitivity                     Pr( +| D)         85.84%     82.74%
Specificity                     Pr( -|~D)         46.46%     38.89%
Positive predictive value       Pr( D| +)         64.67%     60.71%
Negative predictive value       Pr(~D| -)         74.19%     66.38%
----------------------------------------------------------------
False + rate for true ~D        Pr( +|~D)         53.54%     61.11%
False - rate for true D         Pr( -| D)         14.16%     17.26%
False + rate for classified +   Pr(~D| +)         35.33%     39.29%
False - rate for classified -   Pr( D| -)         25.81%     33.62%
----------------------------------------------------------------
Correctly classified                              67.45%     62.26%
----------------------------------------------------------------
ROC area                                          0.7150     0.6425
----------------------------------------------------------------
p-value for Full vs Test ROC areas                0.0000
----------------------------------------------------------------

.
. capture drop TumorModelEAU

.
. predict TumorModelEAU
(option pr assumed; Pr(muscle_invasive))
(1,139 missing values generated)

. roctab muscle_invasive TumorModelEAU, detail

Detailed report of sensitivity and specificity
------------------------------------------------------------------------------
                                           Correctly
Cutpoint        Sensitivity   Specificity  classified        LR+          LR-
------------------------------------------------------------------------------
( >= .1799.. )      100.00%         0.00%      52.74%     1.0000
( >= .1869.. )       98.19%         3.03%      53.22%     1.0126       0.5973
( >= .215941 )       97.74%         8.59%      55.61%     1.0692       0.2635
( >= .2239.. )       95.48%        13.13%      56.56%     1.0991       0.3446
( >= .4261.. )       90.50%        36.87%      65.16%     1.4335       0.2577
( >= .4376.. )       90.50%        37.37%      65.39%     1.4450       0.2542
( >= .482455 )       90.05%        38.89%      65.87%     1.4735       0.2560
( >= .4941.. )       87.78%        40.91%      65.63%     1.4856       0.2986
( >= .5034.. )       84.62%        44.44%      65.63%     1.5231       0.3462
( >= .5151.. )       82.35%        50.00%      67.06%     1.6471       0.3529
( >= .5599.. )       73.76%        56.06%      65.39%     1.6786       0.4681
( >= .5715.. )       58.82%        68.69%      63.48%     1.8786       0.5995
( >= .7743.. )       32.58%        91.92%      60.62%     4.0317       0.7335
( >= .7824.. )       32.13%        92.93%      60.86%     4.5436       0.7304
( >= .811598 )       23.98%        94.44%      57.28%     4.3167       0.8049
( >= .8186.. )       19.46%        95.96%      55.61%     4.8156       0.8393
( >  .8186.. )        0.00%       100.00%      47.26%                  1.0000
------------------------------------------------------------------------------

                      ROC                     Asymptotic normal
           Obs       area     Std. err.      [95% conf. interval]
     ------------------------------------------------------------
           419     0.7141       0.0246        0.66587     0.76228

.
end of do-file

. do "/var/folders/dk/cnyy1w7957ddykv9stjvy6gr0000gn/T//SD03496.000000"

. cutpt muscle_invasive TumorModelEAU, noadjust

Empirical cutpoint estimation
Method:                                   Liu
Reference variable:                       muscle_invasive (0=neg, 1=pos)
Classification variable:                  TumorModelEAU
Empirical optimal cutpoint:               .5151754
Sensitivity at cutpoint:                  0.74
Specificity at cutpoint:                  0.56
Area under ROC curve at cutpoint:         0.65

.
end of do-file
I then evaluated the net benefit at selected cut-off points using the dca command, as follows:
Code:
dca muscle_invasive TumorModelEAU, prob(no) xstart(0.1) xstop(0.6) xby(0.05) nograph ///
    saving("DCA Tumor Model EAU.dta", replace)
use "DCA Tumor Model EAU.dta", clear
g advantage = TumorModelEAU - all
I have a few unresolved questions about this:
- Can I change the cut-offs at which roctab, detail reports sensitivity and specificity so that they match the thresholds I used for the net benefit calculation, i.e. xstart(0.1) xstop(0.6) xby(0.05) in the dca call above?
- How can I calculate the number of patients in whom muscle invasiveness would be missed at each cut-off? Is this by any chance the complement of "Correctly classified" (100% minus that value)?
- How can I calculate positive predictive value and negative predictive value at each cut-off point and at the optimal cut-off for the predictive model?
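To make the three questions concrete, here is a minimal sketch of the kind of per-threshold table I have in mind. It is not a tested solution and rests on assumptions: that muscle_invasive is coded 0/1, that TumorModelEAU holds the predicted probabilities from predict, and that it is run on the original dataset (before the use ..., clear above). The loop bounds simply mirror xstart(0.1) xstop(0.6) xby(0.05). In the looclass table at the 0.5 cut-off, the missed cases are the 32 false negatives, whereas 100% minus "Correctly classified" (32.55%) also includes the 106 false positives, so the two do not appear to be the same quantity.

Code:
```stata
* Sketch only. Assumes muscle_invasive is 0/1 and TumorModelEAU is Pr(muscle_invasive).
* Thresholds 0.10, 0.15, ..., 0.60 are built from integers to avoid rounding drift.
forvalues i = 10(5)60 {
    local c = `i'/100

    * Missing values compare as larger than any number, so exclude them explicitly
    quietly count if muscle_invasive == 1 & TumorModelEAU >= `c' & !missing(TumorModelEAU)
    local tp = r(N)
    quietly count if muscle_invasive == 0 & TumorModelEAU >= `c' & !missing(TumorModelEAU)
    local fp = r(N)
    quietly count if muscle_invasive == 1 & TumorModelEAU < `c'
    local fn = r(N)
    quietly count if muscle_invasive == 0 & TumorModelEAU < `c'
    local tn = r(N)

    * fn = muscle-invasive patients classified negative (missed) at this threshold.
    * A ratio with a zero denominator is displayed as "." (missing).
    display %4.2f `c' "  missed (FN) = " `fn'     ///
        "  sens = " %6.2f 100*`tp'/(`tp'+`fn')    ///
        "  spec = " %6.2f 100*`tn'/(`tn'+`fp')    ///
        "  PPV = "  %6.2f 100*`tp'/(`tp'+`fp')    ///
        "  NPV = "  %6.2f 100*`tn'/(`tn'+`fn')
}
```

The same four counts evaluated at the cutpt estimate (.5151754) instead of the loop value would give PPV and NPV at the optimal cut-off, again under the assumptions above.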
Your help would be greatly appreciated.
Thank you,
Francesco