Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Internal validation of a Fine-and-Gray model, stcrprep by P C Lambert, calibration and discrimination

    Dear Statalisters!

    I'm conducting a model development study, in which I'm trying to predict recurrence of endometrial cancer in the presence of death (from any cause) prior to recurrence as a competing risk. I had initially over 80 candidate predictors (the data is a follow-up from a previous study with another aim). I performed event-per-parameter calculations using pmsampsize, and based on that (and subject matter knowledge) I've selected a handful of candidate predictors.

    Since I'm more interested in prediction, rather than etiology, I've opted for a Fine-and-Gray model (instead of cause-specific modelling)
    "Lau et al suggest that the cause‐specific hazard model is more appropriate for addressing etiological questions, while the Fine‐Gray model is more appropriate for addressing questions around incidence and prognosis. Both we and Wolbers et al have echoed this assertion.4, 15 Thus, if the research objective is to derive a model for predicting the probability of the occurrence of outcomes over time, then a subdistribution hazard model would be appropriate.16, 17"
    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5698744/

    Since I have some continuous candidate predictors, I've also opted for a MFPA-algoritm to obtain fractional polynominals and simultaneously do backwards elimination of redundant candidate predictors.
    "MFP is an approach to multivariable model-building which retains continuous predictors as continuous, finds non-linear functions if sufficiently supported by the data, and removes weakly influential predictors by backward elimination (BE)."
    https://mfp.imbi.uni-freiburg.de/mfp and https://pubmed.ncbi.nlm.nih.gov/29398977/

    Now, here are my problem(s).

    The MFPA-algorithm is very computationally intensive. In my full dataset (1429 obs.), one run takes about 7 minutes, which would be fine if it weren't for the need of later assessing internal validation. For internal validation, with bootstrap as an example, the entire model development process must be repeated in each bootstrap sample, and the number of bootstrap samples should be high (100 to 1000 possibly). So this part would take days! Even with fewer resampling datasets, say k-fold cross validation, it will take hours. Only a single random split would be computationally efficient, but that is not recommended due to low power (and as I described above, the number of candidate predictors were based on an EPP-calculation which assumed that the full dataset would be available for model development).

    I've come across a work-around, stcrprep by Paul C Lambert https://pubmed.ncbi.nlm.nih.gov/30542252/ (PMID: 30542252). This command expands the dataset and assign weights, and is based on crprep in R.
    Geskus (2011) has recently proposed an alternative way to estimate the cause-specific CIF that uses weighted versions of standard estimators. A corresponding R command, crprep, which is part of the mstate package, has been developed that restructures the data and calculates the appropriate weights (de Wreede et al. 2011). This paper describes a Stata command, stcrprep, that has similar functionality to crprep with some further extensions to allow parametric models for the cause-specific CIF to be fitted. After using stcrprep a number of standard Stata survival analysis commands can then be used for the analysis of competing risks. For example, sts graph, failure will give a plot of the cause-specific CIF and stcox will fit the Fine and Gray proportional subhazards model.. - P C Lambert 2017
    Here is an example dataset with 10% (143 obs.) of my original dataset drawn without replacement:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int idnum double(age bmi colorscore vascularpattern_dicho) float(emborder echogenicity tumor_size extrauterine date_of_surgery primary_end_point date_of_primary_event esmo_pre_op_sub)
     262 52  23.5 3 0 2 2 67 1 19463 1 20578 1
    1329 71  25.4 4 1 2 1 47 1 20234 1 20269 1
     551 55 34.89 1 0 2 1 14 1 20424 0 22233 0
     458 72 59.49 4 0 1 2 17 1 18975 1 19479 0
    1384 59  37.1 1 0 1 2  9 0 19129 0 21978 0
    1400 75  33.2 5 0 3 2 57 0 19863 0 21937 1
    1397 73 38.44 2 0 2 2 14 0 19500 1 19739 1
    1495 56  31.9 1 0 2 2 15 0 20030 0 21875 0
     439 69 26.67 3 0 2 2 26 0 19344 0 21804 0
    1789 64  37.2 4 1 2 2 31 0 19988 1 20769 1
     111 64  30.4 2 0 2 2  8 0 19150 0 21783 0
    1036 58 34.22 2 0 2 2  9 0 19170 0 19237 1
    1591 58 30.11 4 0 2 2 26 0 19280 0 21893 1
    1594 59 23.43 3 0 2 1 21 0 19277 0 21851 0
     258 54 32.42 5 0 3 2  4 0 19084 0 21795 0
     892 61 26.44 1 0 2 2  6 0 19186 0 21857 0
     782 68  29.5 1 0 2 2 16 0 20112 0 21826 2
    1395 69 17.84 3 1 2 1 15 0 19493 0 21792 1
     875 61  40.3 3 1 2 2 17 1 20473 0 21850 1
     980 54  22.4 4 1 2 2 27 0 20274 0 21727 1
    1256 67  25.4 1 0 1 1  5 0 19801 0 21854 0
    1192 62  34.8 4 1 2 2 34 0 20156 2 21847 1
    1168 75 35.55 2 1 2 2 31 0 19429 0 21711 1
    1688 69 38.75 2 0 2 1  7 0 18899 2 20458 0
    1182 85  24.2 4 1 2 2 33 0 20151 2 21537 2
    1480 69  35.1 1 0 2 1 14 0 20057 0 21792 0
     853 56 27.18 2 0 2 2 14 0 19373 0 21801 0
     652 76  32.1 5 0 3 2 26 0 19781 0 21874 2
    1453 68 27.44 2 0 1 1  7 0 18947 0 21792 0
     318 43 30.44 3 1 1 2  9 0 18860 0 21808 0
     455 78 25.31 4 1 2 1 16 0 18987 0 21792 1
     197 66 23.43 2 0 2 2 10 0 19464 0 21566 0
     953 51  28.6 4 1 2 2 34 1 19546 0 21696 2
     929 70 32.46 1 0 2 1 10 0 18814 0 21787 0
    1687 72 27.99 1 0 2 1 11 0 18884 2 21517 0
     930 59 34.94 1 0 1 2  7 0 18814 2 21321 0
    1124 67  24.4 4 0 2 1 18 0 20241 0 21792 2
     365 80 30.44 1 0 2 2 24 0 19574 0 21810 1
    1717 68  29.1 4 1 2 1 31 0 20346 0 21894 1
     400 57  30.1 4 1 2 2 49 1 20304 1 21725 0
     475 67  25.7 4 1 2 2 23 0 20424 0 21833 1
    1379 61 31.62 1 0 1 1  8 1 18765 0 21787 0
     718 60  24.2 4 1 2 2 24 0 20136 0 21811 1
    1236 60 21.93 1 0 1 1  5 0 19057 0 21792 0
     227 59  18.3 3 0 2 1 11 0 20206 0 21867 1
     749 83 31.25 1 0 1 1  5 0 19374 0 21792 0
    1092 62 28.22 2 0 2 2 16 0 19148 0 21792 0
     904 83  29.7 3 0 2 2 26 0 19934 0 21993 1
     707 67  30.5 5 0 3 2 47 0 20135 1 20452 1
     548 58  44.4 2 1 2 2  9 0 20436 0 21647 0
     335 62 40.89 2 0 1 2 14 0 19574 0 21864 0
     945 85 32.05 3 1 1 2 30 0 19560 0 21864 1
     858 68  25.5 2 0 1 1  3 0 20109 1 20888 0
     260 47  40.4 1 0 1 1 18 0 19456 0 21864 0
    1492 68  27.8 4 1 1 2 27 0 19687 1 21410 2
     910 52    25 4 1 2 2 22 0 19928 0 21307 2
    1499 62  21.8 4 0 1 2 16 0 20414 1 20760 0
     536 70  31.1 3 0 2 2 27 0 20074 0 21929 1
     319 59 38.67 3 1 2 1 23 0 18855 1 20640 1
     888 83 24.97 3 1 2 2 63 0 18839 1 19240 2
     581 74  28.3 4 1 1 2 23 0 19035 0 21847 1
    1111 73  26.4 4 1 2 1 32 0 19899 0 22233 2
     465 73  24.5 4 1 2 2 65 0 19337 1 19446 1
     343 65  29.2 1 0 2 1  5 0 19599 0 21850 0
    1186 40  47.7 3 0 2 1 18 0 20163 0 21792 0
     790 57  20.9 3 1 2 1 12 0 19753 0 21840 1
      97 61  32.3 2 0 2 1  8 0 20227 0 21810 0
     451 50  22.7 1 0 2 2 12 0 20061 2 20089 0
    1566 38  35.3 4 0 2 1  8 0 20368 0 21871 0
     559 41  25.5 4 1 2 1 52 1 20086 0 21792 2
     247 67  25.2 1 0 1 2  6 0 20201 0 21767 0
    1730 84 35.15 3 0 1 2 11 0 19619 2 21832 0
    1356 68 30.75 4 0 1 2 10 0 19139 0 21621 0
     673 66  23.9 2 0 2 1  9 0 19416 0 21792 2
    1526 55  27.1 1 0 1 1 22 0 20057 0 22237 0
     468 56  45.7 3 0 1 2 25 0 19704 0 21376 1
    1289 86  30.1 1 0 1 1 20 0 20211 0 21868 0
     250 67  21.6 4 1 2 1 29 0 20184 1 21025 2
     917 70  21.9 4 1 2 2 24 0 20299 0 21937 2
    1216 59  27.5 3 0 2 2 23 0 20165 0 21924 1
     991 61  30.5 3 1 2 2 32 0 20458 0 21792 1
    1608 71 31.63 4 0 2 2 22 0 19659 0 21864 1
    1310 72  22.9 5 0 3 2  0 0 19855 0 22233 2
    1457 68 31.31 3 0 2 1 24 0 18946 0 21792 0
     623 62  36.6 1 0 2 2 21 1 19774 1 20647 1
     748 73 34.72 4 1 2 1 34 0 19367 0 21792 1
     514 78 39.25 3 1 2 2 24 0 19329 0 20832 1
     564 53  22.8 3 0 2 1 19 0 20071 0 21711 0
    1273 54  20.2 4 1 2 2 18 1 20219 1 21440 0
    1455 53 32.83 2 0 1 1 18 0 18944 0 21792 0
     824 69 26.61 3 1 2 2 20 0 19371 1 21188 1
     709 55  44.8 2 0 2 1 17 0 20134 0 21792 0
    1441 66  30.4 4 1 2 2 36 1 20422 1 20811 2
     931 55  49.6 1 0 2 1  9 0 18828 0 21787 1
     504 62 25.39 3 0 2 2 17 0 19340 0 21843 2
     571 71  40.4 3 1 2 2 25 1 20485 1 21332 1
    1790 76  28.4 1 0 2 1  1 0 19976 0 21804 0
     868 57  23.3 3 1 2 1 17 0 20103 0 21792 0
    1475 62  31.2 3 0 2 2 18 0 19668 0 21593 1
     290 45  27.6 3 1 1 2 22 0 20205 0 21138 0
     495 77 22.65 2 0 2 1  8 0 18979 2 19328 2
    1744 51  26.6 1 0 1 1  3 0 20332 0 21850 0
    1603 72 31.64 4 1 1 2 18 0 19296 0 22439 0
     347 75    34 1 0 1 2 10 0 19961 0 21809 0
    1306 62 29.13 3 0 2 2 25 1 19485 0 21811 1
     528 83    25 2 0 2 1 16 0 20067 2 21368 0
     679 61 45.26 2 0 2 2 27 0 19413 2 21220 0
    1784 60  29.3 3 1 1 2 11 0 19990 0 21864 0
     754 77 31.95 1 0 2 2 16 0 19373 0 21792 2
      59 84  22.3 3 1 2 1  8 0 20144 2 21705 1
     887 45  35.1 4 1 2 2 66 0 20288 0 21809 1
     382 64  29.4 3 1 2 2 24 1 19960 0 22233 1
    1658 61    43 1 0 2 1 28 0 19997 0 21792 0
     521 66  30.5 3 1 2 2 34 0 20060 0 21854 2
    1674 74 32.77 5 0 3 2  0 0 19245 1 19711 0
     529 64  40.6 2 0 2 1  9 0 20074 0 21792 0
     963 60  32.9 2 0 2 1  7 0 19912 1 20370 2
    1240 40  34.2 2 0 2 2 15 0 19444 0 21864 0
     170 74  24.2 3 0 2 2 16 0 20187 0 21448 0
    1617 72  29.7 3 1 2 2 21 0 19633 0 21115 1
     992 70  40.9 2 0 1 2 16 0 20260 0 21852 1
     632 62  27.6 3 0 1 1  9 0 20144 0 21851 0
     794 62  33.6 3 1 1 1 18 0 20100 0 21822 0
    1106 70  27.1 3 1 2 2 21 0 19875 0 21792 0
    1350 65  31.5 2 1 2 2 25 0 20239 0 22233 1
    1229 57 34.66 4 0 2 2 23 1 19059 0 21850 1
     672 70 32.87 3 0 1 2 26 0 19413 0 21792 1
     440 65 26.66 1 0 2 2  4 0 19353 0 21801 0
     810 59 29.29 3 1 2 2 19 0 18646 0 21850 0
    1019 60  33.2 3 1 2 1 20 0 20248 0 21858 0
     115 76 23.23 3 0 2 2  8 0 19508 0 21864 0
     986 78  24.2 3 1 2 2  0 0 20276 2 20912 1
    1408 71  33.7 3 0 2 2 19 0 20235 0 21946 0
    1610 62  22.6 3 0 2 2 20 1 19652 0 21431 1
     955 59 31.24 2 1 1 2 31 0 19569 0 21844 0
     358 71 29.38 1 0 2 2 15 0 18853 1 19730 0
     976 73  24.2 2 0 2 1  8 0 20282 0 21889 2
    1693 56 43.43 1 0 1 2  7 0 19241 0 22000 0
     682 76    43 1 0 2 1 10 0 19417 2 21262 0
     206 65  41.5 3 0 1 2 39 0 19835 0 21978 0
     619 74 30.38 3 1 2 2 26 0 19416 0 21850 1
    1171 50  21.6 3 0 1 2  9 0 19808 0 21792 0
     582 62 44.44 2 0 2 2 42 0 19403 2 20747 2
    end
    format %tdMon_DD,_CCYY date_of_surgery
    format %tdMon_DD,_CCYY date_of_primary_event
    label values vascularpattern_dicho vasc
    label def vasc 0 "All others", modify
    label def vasc 1 "Multiple, multifocal", modify
    label values emborder emborder
    label def emborder 1 "Regular", modify
    label def emborder 2 "Non-regular", modify
    label def emborder 3 "Endometrium not visible", modify
    label values echogenicity echo
    label def echo 1 "Uniform", modify
    label def echo 2 "Non-uniform", modify
    label values extrauterine yn
    label def yn 0 "No", modify
    label def yn 1 "Yes", modify
    label values primary_end_point prim
    label def prim 0 "No event", modify
    label def prim 1 "Primary event", modify
    label def prim 2 "Competing event", modify
    label values esmo_pre_op_sub preoplabel
    label def preoplabel 0 "Low risk", modify
    label def preoplabel 1 "Intermediate risk", modify
    label def preoplabel 2 "High risk", modify
    First, I try with native Stata commands, so that I will have "accurate" output to compare later:

    Code:
    *St-set the data
    stset date_of_primary_event, id(idnum) failure(primary_end_point == 1) origin(date_of_surgery) scale(365.24)
    
    findit mfpa //st0425, unless you already have it installed.
    
    *Run MFPA-algorithm, fit a Fine-and-Gray model. Force esmo_pre_op_sub into the model.
    
    mfpa, cycles(3) dfdefault(3) select(0.157, i.esmo_pre_op_sub:1) alpha(0.25) : ///
    stcrreg i.esmo_pre_op_sub c.age i.colorscore i.extrauterine ///
    i.vascularpattern_dicho i.emborder i.echogenicity c.bmi c.tumor_size ///
    , compete(primary_end_point==2)  ///The MFPA-algorithm is very slow, for the full dataset each run takes about 7 min.
    
    *Draw the CIF, by esmo_pre_op_sub group
    stcurve, cif at1(esmo_pre_op_sub=0) at2(esmo_pre_op_sub=1) at3(esmo_pre_op_sub=2)
    the stcurve produces this image:
    Click image for larger version

Name:	Graph_stcrreg.jpg
Views:	3
Size:	41.7 KB
ID:	1740385




    and then with stcrprep and stcox:

    Code:
    * First, data must be st-set to later be rearranged, expanded and assigned weights.
    stset date_of_primary_event, id(idnum) failure(primary_end_point == 1, 2) origin(date_of_surgery) scale(365.24) //Both failcodes has to be included in the failure() option
    
    findit stcrprep //st0471, unless you already have it installed
    
    stcrprep, events(primary_end_point) keep(esmo_pre_op_sub extrauterine age colorscore vascularpattern_dicho emborder echogenicity bmi tumor_size) trans(1 2) //Here I'm omitting the byg() option, since I want the same output as for stcrreg. See Lambert for details; "The original data is now reloaded and stcrprep is run again, but this time the censoring distribution used to generate the weights is calculated for the cohort as a whole, i.e. the byg() option is not used. This is because we want to fit the same model as that fitted by stcrreg, which does not calculate weights separately by subgroup."
    
    gen event = primary_end_point == failcode
    stset tstop [pw=weight_c], failure(event) enter(tstart) noshow //"If one uses pweights rather than iweights, along with using stset and the vce(cluster patid) option for the stcox command, then the standard errors are the same as stcrreg to four decimal place". quote from Lambert.
    
    mfpa, cycles(3) dfdefault(3) select(0.157, i.esmo_pre_op_sub:1) alpha(0.25) : ///
    stcox i.esmo_pre_op_sub c.age i.colorscore i.extrauterine ///
    i.vascularpattern_dicho i.emborder i.echogenicity c.bmi c.tumor_size if failcode==1 ///
    , vce(cluster idnum) //After using stcrprep, stcox will do the same as stcrreg did before transforming the data.
    
    stcox, hr
    
    sts graph if failcode==1, by(esmo_pre_op_sub) failure ///
                  ytitle("Cumulative incidence") ///
                  xtitle("Years since surgery") ///
                  ylabel(0(0.1)0.5, angle(h) format(%3.1f)) ///
                  legend(order(1 "Low Risk" 2 "Medium Risk" 3 "High Risk") ///
                  cols(1) ring(0) pos(5)) ///
                  scheme(sj) name(cif_relapse, replace)
    and the graph looks like this:
    Click image for larger version

Name:	cif_relapse_stcrprep.jpg
Views:	3
Size:	38.8 KB
ID:	1740386


    We see that the SHRs, z-scores, p-values and all of the output is identical. So the stcrprep-->stcox seems to be working. But the graphs don't look the same.

    I e-mailed Paul C Lambert and he replied :
    Your first do file uses stscurve, so is not the same thing as using sts graph in the 2nd do file, you are comparing model based covariate specific estimate with the non-parametric Aalen Johansson estimate. If you compared the sts graph estimates with stcompet you should get the same estimates.
    So I get that they are not the same, so it should not be surprising that the graphs don't look the same. But which is the correct one to calculate and report in my paper? Which would you "normally" report when the aim is to develop a Fine-and-Gray prediction model of recurrence of cancer?

    This was my first problem, we can call that "the stcrprep problem".

    Here comes my second problem, which we can call "the internal validation problem".

    How would you go on about assessing internal validation of this model? I've seen so many different recommendations. van Gerloven et al in Validation of prediction models in the presence of competing risks: a guide through modern methods (https://www.bmj.com/content/bmj/377/...69249.full.pdf) presents a calibration plot (Fig 1) drawn by using psedo-observations to compensate for censoring, and they provide R code on their Github page (https://github.com/survival-lumc/Val...readme-ov-file) but I have no idea how to do this in Stata (or in R for that matter). Pseudo-values are also used (and maybe first time introduced?) in Calibration plots for risk prediction models in the presence of competing risks by Gerds, Andersen and Kattan (https://pubmed.ncbi.nlm.nih.gov/24668611/). Unfortunately, this paper does not come with an Appendix with R or Stata code, and I cannot make use of the mathematical formulae presented in the paper. And then we have Graphical calibration curves and the integrated calibration index (ICI) for competing risk models by Austin, Putter, Giardiello and van Kleveren (https://diagnprognres.biomedcentral....12-021-00114-6). In the manuscript they refer to R code available in the appendix, but it is actually lacking in the appendix. So, understand my frustration! The same goes for discrimination, there doesn't seem to be a straight-forward way to assess that either (Gönen and Hellers C-statistic, Wolberts method, Somers D etc etc) and how to practically do it in Stata. And especially how to do it after first expanding the data using stcrprep!

    I suppose since the Fine-and-Gray model calculates the CIF, the most intuitive way to assess calibration would be to plot the CIF obtained from the model (with covariates set at their mean, or maybe mode, or maybe for a typical 'low risk case' and a typical 'high risk case') and overlay the actual observed cumulative incidence. But that is not what I've seen recommended (see the papers above), and I don't know why. Seems like I'm missing something here.

    Another way to assess calibration I've seen is to make a prediction at, say 3 years, and plot at calibration plot as one normally would for a logistic regression, that is, have predicted risk in bins of deciles on the x-axis and the mean observed risk for those cases on the y-axis. But to me it seems counterintuitive to first develop a model that calculates time-to-event, only to then assess the models performance at only one or a few timepoints. Then why not fit a logistic model in the first place and change the outcome to event prior to 3 years yes/no?

    So, in summary:
    1. Why am I getting different CIFs when doing stcrreg-->stcurve, cif compared to stcrprep-->stcox-->sts graph, failure? Or rather, which is the more appropriate way to present the CIF after fitting a Fine-and-Gray model?
    2. How would I go on about assessing internal validation with calibration and discrimination after fitting a Fine-and-Gray model after expanding the data using stcrprep?



  • #2
    Post triggered the spam filter, administrative bump.

    Comment

    Working...
    X