Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • stpm2cr - Help With Model Coding & Plotting

    Hi everyone,

    At the suggestion of the group and the consent of my committee, I successfully used -stpm2- to model mortality outcomes as part of my dissertation. For my next effort, I want to use -stpm2cr- for a competing risks analysis, as I've got some other outcomes to model that compete with death. However, I'm struggling to get the code written properly and in doing so, be knowledgable to what I'mdoing. After not having much success with this on my own, I'm hoping to see if someone here can help me check out my code and set me on the right path. Please forgive the length of this as I'm hoping to paint as comprehensive a picture as I can. I'm using Sarwar Mozumder's 2017 paper as my guide, specifically s.5.5 'Time-dependent effects'. My study looks at two treatments and I'm looking at outcomes over time - in this instance, my outcome is a composite of CV outcomes that competes with death.
    Code:
    *----------------------------------------------------------------------------------------*; 
    * Declare data to be survival-time data, with ID variable, with exit set at 36 months
    
    stset tidxlstaged2CVO_m, id(PID) failure(statusCVODTH3yrCR==1, 2) exit(time 36) // First CV Outcome = "1", Death = "2"
    Here I've -stset- the data with a failure variable that consists of either a CV outcome or death at three years.
    Code:
    *----------------------------------------------------------------------------------------*; 
    * Fit a flexible parametric competing-risks model using a direct likelihood approach for the cause-specific cumulative incidence function
    
    stpm2cr ///
        [CVOutcome: studygrp, scale(hazard) df(5) tvc(studygrp) dftvc(5)] ///
        [Death: studygrp, scale(hazard) df(5) tvc(studygrp) dftvc(5)], ///
            events(statusCVODTH3yrCR) cause(1 2) censvalue(0) eform level(95)
    Here I run the -stpm2cr- model setting up the two equations - one for CV outcomes and the other for death. I originally had a censored value (-censvalue-) at -3- but the model threw errors. It wasn't until I recoded the variable to have censored values equal -0- that the model ran successfully. This seems to have worked and I get some coefficients that make some sense with my data.
    Code:
    *----------------------------------------------------------------------------------------*; 
    * Predict cumulative incidence functions & subhazard ratios
    
    * Create a time variable for the purposes of prediction.
    
    range figuretime 0 36
    It's not clear from the documentation if I need this variable or not but the prediction code didn't run properly without it. The documentation suggests you need it with big data sets and mine isn't that, at least I don't think it is (~25K subjects). The tutorial has three values here (0 15 1000) and the figures generated all stop at 15, so maybe it truncates the data, but I'm not honestly sure if that's true or if I need it. My study has ~25K subjects, for what that's wroth. Would welcome guidance on this.

    Code:
    * Predict cumulative incidence functions.
    
    predict cif1_tvc1_CVO, cif at(studygrp 0 studygrp 1) ci timevar(figuretime)
    predict cif2_tvc2_CVO, cif at(studygrp 1 studygrp 0) ci timevar(figuretime)
    
    * Predict cumulative incidence function difference.
    
    predict cifdiff_CVO, cifdiff1(studygrp 0 studygrp 1) cifdiff2(studygrp 1 studygrp 0) ci
    
    * Predict cause-specific hazard ratio.
    
    predict chr_CVO, chrn(studygrp 0 studygrp1) chrd(studygrp 1 studygrp 0) ci
    
    * Predict subdistribution subhazard ratio fir a specific cause.
    
    predict shr_CVO, shrn(studygrp 0 studygrp 1) shrd(studygrp 1 studygrp 0) ci // shrn = subhazard ratio numerator; shrd = subhazard ratio denominator
    When I run this code, I get a set of new variables in Stata. What I am unsure about is if I am coding the variables in the predict lines properly. For example, the tutorial uses -stage1- and -stage2- which look like the study groups. I'm doing the same thing here yet, but I'm not sure its correct. In any event, I end up with ten predictions and their confidence intervals after all of this:
    • cif1_tvc1_CVO_c1
    • cif1_tvc1_CVO_c2
    • cif2_tvc2_CVO_c1
    • cif2_tvc2_CVO_c2
    • cidiff_CVO_c1
    • cidiff_CVO_c2
    • chr_CVO_c1
    • chr_CVO_c2
    • shr_CVO_c1
    • shr_CVO_c2
    I'm not sure why I have two prediction values for each -predict- line. Does anyone know the significance of -c1- compared to -c2- for each prediction?

    I think this is also at the root of my plotting difficulties. I'd like to produce plots that compare the two study groups. So far, my efforts to do this just aren't working. The code I have written doesn't produce anything intelligible and I think it's because I don't understand what the generated variables and their labels, specfically -c1- and -c2-, is. Unfortunately, the paper doesn't include code to direct the plotting novice so guidance on what to plot would be appreciated.

    I'll stop there for now - thanks for any suggestions you may have! Very grateful to all for reading and considering my puzzle!



  • #2
    That code seems close.

    The line
    range figuretime 0 36 Makes a new variable called figuretime, incrementing progressively from 0 to 36. If you did (0 15 1000) as you say, it goes from 0 to 15, in 1000 steps. You'd do that if you wanted to predict from time 0 to time 15 on the x-axis, in very teeny x-axis increments. You need that variable because it will form the x-axis for your plots, telling Stata at what timepoints do you wish to predict the CIF. And you need to run that command BEFORE you run the predict statement, so that predict knows at what times you want to predict.

    I would simplify your predict statement to

    predict cif1_tvc1_CVO, cif at(studygrp 0) ci timevar(figuretime)

    predict cif2_tvc2_CVO, cif at(studygrp 1) ci timevar(figuretime)

    The way you wrote it...at(studygrp1 studygrp0)...doesn't make sense to me. You are telling it what covariate mixture to predict at. studygrp should be EITHER 0 OR 1, but can't be both like you are telling it do in your code. So, above the first predict statement gives you predictions at studygrp=0 and produces the variable called cif1_tvc1_CVO_c1 that you see. The second predict statement above gives you predictions at studygrp=1 and produces a variable called cif2_tvc2_CVO_c1 that you see.

    The _c1 means cause1 (as opposed to the competing risk, which you might have coded as cause=2). You could put an option on your predict statements like cause(1) to be clearer about this. It means you're predicting your main outcome, and not the competing risk.

    To get the graph...easy.

    twoway ///
    line cif1_tvc1_CVO_c1 figuretime, lcolor(blue) || ///
    line cif2_tvc2_CVO_c1 figuretime, lcolor(red)

    Notice how figuretime is the x-axis, whereas the predict command created variables for your y-axis. And each column represents the CIF at a different value of studygrp.

    You could also do a line or rarea plot to plot confidence intervals around each line.
    Last edited by Sam Terman; 10 Apr 2024, 16:48.

    Comment

    Working...
    X