Development of a clinical nomogram

Francesco Ditonno

Join Date: Dec 2022
Posts: 11


29 Jan 2024, 11:36

Dear Statalist members,

I am currently working on a dataset comprising over 1000 patients to develop a clinical nomogram to predict muscle invasiveness (binary outcome: yes/no) at final histology following a specific surgical procedure. The study population involves patients undergoing the same surgical procedure within a specified risk group.

My objective is to build the nomogram from the best of several candidate predictive models, each based on a different set of (mostly categorical) variables. I will select the model with the highest AUC and the greatest net benefit on decision curve analysis.
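For illustration, a minimal sketch of how the AUC of two candidate models could be compared on a common sample. It assumes the staging and endoscopic models defined in the code further down; the variable names p_staging and p_endoscopic are hypothetical and used only for this example.

```stata
* Sketch only: compare the AUC of two candidate models.
* p_staging and p_endoscopic are illustrative names, not part of the dataset.

logistic muscle_invasive grade_bio clinicalt_high
predict double p_staging if e(sample), pr

logistic muscle_invasive grade_bio variant_histology
predict double p_endoscopic if e(sample), pr

* roccomp tests equality of the ROC areas and can overlay the curves
roccomp muscle_invasive p_staging p_endoscopic, graph summary
```

The comparison is only meaningful on patients with non-missing predictions from both models, so the reported number of observations is worth checking against each model's estimation sample.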

I encountered a syntax error in the decision curve analysis that I am struggling to resolve. I would like the final decision curve analysis graph to contain one curve for each predictive model. I suspect the error arises because the variable I intend to use does not exist in my original dataset, and I am unsure how to generate it correctly, since it must contain each model's predictions expressed as probabilities.

Below is a sample dataset generated with dataex, followed by the code I am using on the dataset. In the part of the code after the syntax error, I have replaced the actual variable names with generic ones, because I do not yet know which model will turn out to be the best and be used to develop the nomogram.

I would be grateful if anyone could find a solution to this and, if possible, complete the code after the "* Calculate the increase net benefit with different cut-off (5% increase) of the predictive model with the best AUC and net benefit" line with the correct variable names. I will also need to correct the DCA for overfitting; could you please add the command for that to the code?
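To make the overfitting request concrete, here is a minimal sketch of one possible approach: a single round of 10-fold cross-validation that produces out-of-fold probabilities for one model and passes them to dca. It uses the staging model as an example. The names complete, u, fold, staging_cv and p_tmp, as well as the seed, are assumptions made for this illustration; repeating the procedure several times and averaging may be preferable.

```stata
* Sketch only: 10-fold cross-validated predictions for the staging model.
* complete, u, fold, staging_cv and p_tmp are hypothetical helper names.
set seed 20240129                      // arbitrary seed, for reproducibility

* Flag complete cases and assign folds, stratified by outcome
gen byte complete = !missing(muscle_invasive, grade_bio, clinicalt_high)
gen double u = runiform()
sort complete muscle_invasive u
by complete muscle_invasive: gen byte fold = mod(_n, 10) + 1

* Fit on nine folds, predict on the held-out fold
gen double staging_cv = .
forvalues k = 1/10 {
    quietly logit muscle_invasive grade_bio clinicalt_high if fold != `k' & complete
    quietly predict double p_tmp if fold == `k' & complete, pr
    quietly replace staging_cv = p_tmp if fold == `k'
    drop p_tmp
}

* Decision curve based on the out-of-fold probabilities
dca muscle_invasive staging_cv, xstart(0.05) xstop(0.35) smooth
```

Because each patient's probability comes from a model that did not see that patient, the resulting curve should be less optimistic than one based on in-sample predictions.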

Your assistance is highly appreciated. Thank you in advance!

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float muscle_invasive byte grade_bio float(clinicalt_high size_tumor_high_EAU size_tumor_high_NCCN) byte(preop_cyto_result multifocal) float previous_cystectomy byte variant_histology
1 1 0 . . . 1 0 .
0 1 1 0 0 . 0 0 0
0 0 0 1 1 . 0 0 0
0 1 0 0 0 . 0 0 0
0 1 0 1 1 1 0 0 0
0 1 0 . . . 0 0 .
0 1 1 1 1 0 0 0 .
0 1 1 1 1 . 0 0 0
0 1 0 1 1 . 0 0 0
0 1 0 1 1 1 0 0 0
1 1 0 1 1 . 1 0 .
1 1 1 1 1 1 0 0 0
0 1 0 1 1 . 0 0 0
1 1 1 1 1 1 0 0 0
0 1 1 1 1 3 0 . 0
0 0 0 1 1 0 1 0 0
0 0 1 0 1 . 0 0 0
1 1 0 1 1 . 1 1 0
1 1 1 1 1 . 0 0 0
1 1 1 1 1 . 0 0 0
1 1 0 1 1 . 0 0 0
1 1 0 1 1 . 0 0 0
0 1 0 1 1 . 1 0 0
1 1 0 . . . 1 0 0
1 1 1 0 0 1 0 0 0
1 1 0 1 1 1 0 0 0
0 1 0 . . . 1 0 0
0 0 0 1 1 . 0 0 0
0 1 1 1 1 . 1 0 .
0 0 1 1 1 1 1 0 0
1 1 0 . . 1 0 0 0
1 1 1 0 0 . 0 0 0
1 1 0 . . . 0 0 0
0 1 1 . . . 1 0 0
1 1 0 1 1 . 0 0 0
1 1 0 . . . 0 0 0
1 0 0 0 1 . 0 0 0
1 1 0 1 1 . 0 0 0
1 1 1 . . . 0 0 0
1 1 0 1 1 1 0 0 0
0 1 0 . . 2 0 0 0
0 0 0 1 1 0 0 0 0
1 1 0 . . . 0 0 0
0 0 1 . . . 0 0 0
0 0 0 . . 2 1 0 0
0 1 0 . . 2 0 0 0
1 1 1 1 1 1 1 0 0
1 1 0 0 1 1 0 0 0
1 1 1 . . 2 0 0 .
1 0 1 . . . 0 0 0
0 1 0 0 0 . 0 0 0
1 1 0 1 1 1 0 0 0
1 1 1 1 1 . 0 0 0
0 1 0 1 1 1 0 0 0
1 1 1 0 0 1 0 0 0
0 1 0 1 1 1 0 0 0
0 1 0 1 1 0 1 0 0
1 1 0 1 1 3 0 0 0
0 1 0 . . 1 0 0 0
0 1 0 . . 1 0 0 0
1 1 0 1 1 1 1 0 0
0 0 1 . . . 0 0 0
0 0 0 0 1 . 0 0 0
1 1 0 . . . 0 0 0
1 1 0 1 1 . 0 0 0
0 0 0 1 1 0 1 0 0
1 1 1 1 1 0 0 0 .
0 1 0 1 1 2 0 0 0
0 1 0 . . 1 0 0 0
1 1 0 0 0 2 0 0 0
0 0 0 1 1 2 0 0 0
1 0 0 1 1 . 0 0 0
1 1 0 1 1 1 1 0 0
1 1 0 0 0 1 0 0 0
1 1 1 1 1 . 0 1 0
1 1 0 1 1 2 0 0 .
1 1 1 1 1 1 0 0 0
1 1 1 1 1 . 0 0 0
0 0 0 1 1 2 0 0 0
1 1 1 1 1 . 0 1 0
0 1 0 1 1 1 0 0 0
1 1 1 1 1 . 0 0 0
1 0 0 0 0 . 1 0 0
1 0 0 . . 1 1 0 0
0 1 0 1 1 . 1 0 0
0 1 0 . . . 0 0 0
1 1 1 1 1 . 1 0 0
1 1 1 1 1 . 0 0 0
1 0 1 1 1 2 0 0 0
1 1 1 1 1 . 0 0 0
1 1 0 1 1 . 0 0 0
1 1 0 1 1 . 0 0 0
1 1 1 1 1 . 0 0 0
1 1 1 1 1 1 0 0 0
1 1 0 0 1 . 0 0 0
0 1 0 1 1 1 0 0 0
0 0 1 1 1 1 0 0 0
1 1 0 . . . 0 0 0
1 1 0 1 1 2 0 0 0
1 1 0 0 0 . 0 0 0
end
label values grade_bio grade_bio_
label def grade_bio_ 0 "Low Grade", modify
label def grade_bio_ 1 "High Grade", modify
label values preop_cyto_result preop_cyto_result_
label def preop_cyto_result_ 0 "Negative", modify
label def preop_cyto_result_ 1 "Positive", modify
label def preop_cyto_result_ 2 "Atypia/Suspicious", modify
label def preop_cyto_result_ 3 "Not diagnostic", modify
label values multifocal multifocal_
label def multifocal_ 0 "No", modify
label def multifocal_ 1 "Yes", modify

These are the commands I used:

Code:

*population setting
keep if grade_bio==1 | clinicalt_high==1 | size_tumor_high_EAU==1 | size_tumor_high_NCCN==1 | preop_cyto_result==1 | multifocal==1 | previous_cystectomy==1 | variant_histology==1
keep if type_surg==2
drop if pt_path==.
//PREDICTIVE MODELS  
*univariate analysis  
logistic muscle_invasive grade_bio
logistic muscle_invasive clinicalt_high
logistic muscle_invasive size_tumor_high_EAU  
logistic muscle_invasive size_tumor_high_NCCN  
logistic muscle_invasive preop_cyto_result  
logistic muscle_invasive multifocal  
logistic muscle_invasive previous_cystectomy  
logistic muscle_invasive variant_histology  
*multivariate analysis  
//optionally add lsens to compute sensitivity and specificity
//clinical model (based on variables only obtainable at CT and anamnestic evaluation)  
logistic muscle_invasive clinicalt_high size_tumor_high_EAU multifocal previous_cystectomy preop_cyto_result, coef  
lroc  
looclass muscle_invasive clinicalt_high size_tumor_high_EAU multifocal previous_cystectomy preop_cyto_result, model(logit) fig  
capture drop clinical_model_EAU_prediction  
predict clinical_model_EAU_prediction  
label variable clinical_model_EAU_prediction "Clinical model EAU"  

logistic muscle_invasive clinicalt_high size_tumor_high_NCCN multifocal previous_cystectomy preop_cyto_result, coef  
lroc  
looclass muscle_invasive clinicalt_high size_tumor_high_NCCN multifocal previous_cystectomy preop_cyto_result, model(logit) fig  

capture drop clinical_model_NCCN_prediction  
predict clinical_model_NCCN_prediction  
label variable clinical_model_NCCN_prediction "Clinical model NCCN"  

//endoscopic model (based on variables only verifiable after URS)  

logistic muscle_invasive grade_bio variant_histology, coef
lroc
looclass muscle_invasive grade_bio variant_histology, model(logit) fig
capture drop endoscopic_model_prediction  
predict endoscopic_model_prediction  
label variable endoscopic_model_prediction "Endoscopic model"  

//tumor-related model (based only on tumor features)  

logistic muscle_invasive grade_bio clinicalt_high size_tumor_high_EAU multifocal variant_histology  
lroc  
looclass muscle_invasive grade_bio clinicalt_high size_tumor_high_EAU multifocal variant_histology, model(logit) fig

capture drop tumor_model_EAU_prediction  
predict tumor_model_EAU_prediction  
label variable tumor_model_EAU_prediction "Tumor model EAU"

logistic muscle_invasive grade_bio clinicalt_high size_tumor_high_NCCN multifocal variant_histology  
lroc  
looclass muscle_invasive grade_bio clinicalt_high size_tumor_high_NCCN multifocal variant_histology, model(logit) fig  

capture drop tumor_model_NCCN_prediction  
predict tumor_model_NCCN_prediction  
label variable tumor_model_NCCN_prediction "Tumor model NCCN"  

//staging model (based only on clinical tumor grade and stage, which are the strongest predictors of worse prognosis)  

logistic muscle_invasive grade_bio clinicalt_high  
lroc
looclass muscle_invasive grade_bio clinicalt_high, model(logit) fig  

capture drop staging_model_prediction  
predict staging_model_prediction  
label variable staging_model_prediction "Staging model"  

*Run the decision curve with dca command (https://www.danieldsjoberg.com/dca-t...ial-stata.html) and save out net benefit  

* Note: /// must end the line being continued; it cannot start the continuation line
dca muscle_invasive clinical_model_EAU_prediction clinical_model_NCCN_prediction endoscopic_model_prediction tumor_model_EAU_prediction tumor_model_NCCN_prediction staging_model_prediction, ///
    xstart(0.05) xstop(0.35) xlabel(0(0.01)0.35) smooth ///
    saving("DCA Output marker.dta", replace)

*nomogram visual description is executed on the predictive model with the best AUC and net benefit  

nomolog  

* Calculate the increase net benefit with different cut-off (5% increase) of the predictive model with the best AUC and net benefit
* "model" is a generic placeholder for the prediction variable of the selected model

* preserve/restore keeps the patient data in memory for the dca call below
preserve
use "DCA Output marker.dta", clear
g advantage = model - all
label var advantage "Increase in net benefit from using Marker model"
restore

*Calculate the interventions avoided of the predictive model with the best AUC and net benefit
dca muscle_invasive model, prob(no) intervention xstart(0.05) xstop(0.35)

Thank you in advance
Francesco Ditonno

Last edited by Francesco Ditonno; 29 Jan 2024, 11:44.
