Hello all,
The problem is that I want to generate confidence intervals for the various test characteristics: sensitivity, specificity, positive predictive value (ppv), and negative predictive value (npv).
I have taken the matter as far as I can on my own. My question, essentially, is whether there is a better way to calculate the confidence intervals for sensitivity, specificity, ppv, and npv.
I'm working on a project evaluating several biomarkers to predict pre-eclampsia (high blood pressure that occurs in pregnancy).
The basic framework of this part of the project is to add a biomarker to a clinical model of some common clinical variables.
The code for the analysis is below.
The loop runs through the seven biomarkers I am investigating.
The logit model has PET as a binary outcome and uses each biomarker, with age, BMI, race, and clinic mean arterial pressure as additional variables in the model.
Then I use estat classification to generate the sensitivity and specificity.
Code:
foreach var of varlist bin_* {
    qui logit PET `var' age bmi_pre_preg race_bin clinicMAP
    estat classification, cutoff(0.5) /* can make predicted probability greater than 50% */
}
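For context, here is a minimal sketch of where the individual test characteristics can be read from after the classification step. It assumes the variables used in the loop are in memory and uses bin_Follistatin as a single illustrative biomarker. The r() names are the ones that estat classification stores; the display labels are just my own wording.

```stata
* Sketch: inspect the results that estat classification leaves in r()
* for one biomarker (bin_Follistatin is used purely as an illustration)
quietly logit PET bin_Follistatin age bmi_pre_preg race_bin clinicMAP
estat classification, cutoff(0.5)

* List everything stored in r(), then show the four quantities of interest
return list
display "Sensitivity (%): " %6.2f r(P_p1)
display "Specificity (%): " %6.2f r(P_n0)
display "PPV (%):         " %6.2f r(P_1p)
display "NPV (%):         " %6.2f r(P_0n)
```

These values are reported on a percentage scale (0 to 100), which is why a valid interval for any of them should not exceed 100.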
I eventually read some old posts, which I can no longer find, that suggested bootstrapping as a possible solution. After muddling through, I was able to write a simple program to bootstrap these estimates.
Code:
* my program for bootstrapping confidence intervals
capture program drop cutci
program define cutci
    version 13.0
    args varname
    logit PET `varname' age bmi_pre_preg race_bin clinicMAP
    estat classification, cutoff(0.5)
end

foreach var of varlist bin_* {
    bootstrap sen = r(P_p1) spe = r(P_n0) ppv = r(P_1p) npv = r(P_0n), reps(100) seed(12345): cutci `var'
}
So, all well and good so far, except that the confidence intervals generated by the bootstrapping have an upper bound that surpasses 100% in some circumstances, which is of course impossible for a percentage. Below is an example of the output:
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sen |   78.57143   12.72385     6.18   0.000     53.63315    103.5097
         spe |    98.4375   .9245184   106.47   0.000     96.62548    100.2495
         ppv |   84.61538   8.047613    10.51   0.000     68.84235    100.3884
         npv |   97.67442   1.354036    72.14   0.000     95.02056    100.3283
------------------------------------------------------------------------------
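As far as I can tell, the bounds above 100 come from the "Normal-based" construction itself: the interval is the observed estimate plus or minus a normal critical value times the bootstrap standard error, and nothing in that formula restricts the result to the 0 to 100 range. The sketch below rebuilds the sensitivity interval by hand from the numbers in the table. The local macro names are only illustrative.

```stata
* Sketch: rebuild the normal-based 95% interval for sensitivity by hand
* (est and se are copied from the "sen" row of the table above)
local est 78.57143
local se  12.72385

* estimate -/+ z(0.975) * bootstrap standard error
display "lower: " %9.4f (`est' - invnormal(0.975)*`se')
display "upper: " %9.4f (`est' + invnormal(0.975)*`se')
```

Up to rounding of the displayed standard error, this should reproduce the reported bounds of roughly 53.63 and 103.51, so the overshoot is a property of the symmetric interval rather than of the bootstrap resampling.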
Here is a simplified sample of the data. Included are three of the biomarkers under investigation. I have seven biomarkers in total but have provided three for simplicity.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(PET age) double bmi_pre_preg byte race_bin double clinicMAP float(bin_Follistatin bin_GlyFibronectingmL bin_InhibinApgmL)
1 35 31.47391933 0 110.33333333333333 0 0 1
1 45 27.76709812 1 101 0 1 0
0 40 27.00513097 1 99.33333333333333 1 0 0
1 34 28.84153181 1 97.66666666666667 1 1 1
0 43 29.99671269 1 96.33333333333333 1 0 0
0 25 21.58226077 0 103.66666666666666 0 1 0
1 30 58.51490993 0 102.16666666666666 1 0 0
1 41 35.15625 1 95.33333333333333 1 1 1
1 30 26.57103476 0 102.91666666666667 1 1 1
1 42 24.40729546 1 92.8888889 1 1 1
1 48 23.0583271 0 99 1 1 1
0 39 54.55568942 0 98 0 0 0
0 36 19.65983459 0 99 0 0 0
1 29 25.15589569 0 98.75 1 1 1
1 36 22.86659805 1 91.16666666666667 1 0 1
0 42 28.28282828 0 97.33333333333333 0 0 0
1 39 33.25488986 0 97 1 0 1
1 35 27.21730295 1 89.33333333333333 0 1 0
0 39 30.50508507 1 88.83333333333333 0 0 1
0 38 23.87511478 1 88.66666666666667 0 0 1
0 41 23.26333168 0 96 1 0 0
0 44 29.58981476 0 95.66666666666667 1 0 0
0 39 26.7755102 0 95.55555554666665 1 0 0
0 31 22.81949008 0 95.66666666666666 0 1 0
0 31 30.83653053 0 95.16666666666667 0 0 1
0 40 31.08109659 1 87.66666666666667 1 0 0
0 28 35.11682865 0 94 1 0 0
0 32 24.27618286 1 86.66666666666666 1 0 0
0 42 39.05273621 0 93.44444445333333 1 0 0
0 42 24.1671624 1 86 0 0 0
0 39 24.79963244 0 93.33333333333333 1 0 0
0 38 38.70975484 0 92.66666666666667 0 1 0
0 33 40.2381133 0 92 0 0 0
0 40 27.88518739 0 92.33333333333333 0 0 0
0 31 20.80087514 0 92.33333333333333 0 0 0
0 40 24.91349481 0 92.16666666666667 1 1 0
0 41 25.2640542 1 84.66666666666667 1 0 1
0 37 23.63281255 1 84.66666666666667 0 0 0
0 29 27.43507725 0 91.66666666666667 0 0 1
0 26 32.69537789 0 91.33333333333333 1 1 0
0 33 25.99591237 0 90.66666666666667 0 0 1
0 32 23.73866213 0 90.66666666666667 0 1 1
0 38 22.96176738 0 90.58333333333333 0 0 0
0 34 32.10720958 0 90 1 0 0
0 32 21.11268513 0 90.33333333333333 1 0 0
0 39 26.61666922 0 90 0 1 0
0 42 24.7480492 0 90 1 1 1
0 35 25.23692448 0 90 0 0 0
0 44 30.24199597 0 89.66666666666667 0 1 0
0 38 27.54469581 1 82.16666666666667 0 1 0
0 33 25.71166208 1 82.16666666666667 0 1 0
0 39 27.24614824 0 89.5 0 0 0
0 35 21.67125803 0 89.66666666666667 1 1 0
0 39 24.19159307 0 89.33333333333333 1 0 1
1 36 24.12901516 0 89.33333333333333 1 0 0
0 33 48.75725132 1 81 1 0 0
0 33 19.605192 0 89.22222223333333 0 1 1
0 30 32.02036958 0 88.83333333333333 1 0 0
0 35 25.63115586 0 88.66666666666666 0 0 0
0 34 23.56742752 0 88.66666666666667 0 1 1
0 42 23.01573179 0 88.33333333333333 1 0 0
0 34 27.46365545 0 88.22222221333334 1 0 0
0 33 23.61830085 0 88.33333333333333 1 0 0
1 37 27.87845515 0 88 1 0 0
0 39 30.77870114 0 87.8 0 1 0
0 35 23.32341806 0 88 0 0 0
0 35 30.11940191 0 86.99999998666667 0 0 0
0 33 32.44936521 0 86.77777778666666 0 0 0
0 36 42.43663439 0 86.33333333333333 0 1 0
0 40 29.74972106 0 86.66666666666666 0 0 0
0 39 22.89281998 0 86.66666666666667 0 0 0
0 43 20.2020202 0 86.66666666666667 0 0 1
0 37 33.28140022 0 86.16666666666667 0 0 0
0 38 33.53067931 0 86 0 0 0
0 42 38.32001657 1 78.33333334666666 0 0 0
0 37 23.00081144 0 86.16666666666667 0 0 0
0 39 22.75830678 0 86.16666666666667 1 0 0
0 38 21.15931349 0 86 0 0 1
0 38 22.72451072 0 85.8888889 0 0 0
0 36 27.32260031 0 85.66666666666667 0 0 0
0 36 24.88893776 0 85.22222221333334 1 0 0
0 35 29.29456582 0 84.33333333333333 0 0 0
0 41 24.09297052 0 84.44444443333333 0 1 0
0 38 32.88888889 1 76.66666666666667 1 1 0
0 39 21.03806228 0 84.33333333333333 1 1 0
0 34 21.6591398 0 84.1111111 0 0 0
end
label values race_bin race_bin
label def race_bin 0 "White or other", modify
label def race_bin 1 "Black", modify
I suppose I could just add more reps to the bootstrapping and hope the confidence intervals shrink enough to be plausible, but I feel like there must be a more sophisticated way to do this properly.
Thanks all.
Christopher Labos