Survival analysis in Stata: stratification on a predictor

Diana Soria

Join Date: Nov 2016

Posts: 9
#1

06 Oct 2021, 10:23

Hello!

I conducted a survival analysis using Stata 16. One of my predictor variables did not meet the proportional hazards assumption, so I decided to stratify on that variable (I assessed the survival function by stratum of this variable, and ran other descriptive tests before deciding to stratify). After this, all the predictors in the model met the assumption. This is an example of what the model and results look like.

The problem is that I'm not sure how to interpret this and how to describe it in the methods of my manuscript.
Does anyone have a recommendation?
. stcox i.inf_all_pregs_ge35y_6m i.race_3cats i.gt70k i.smokpreg_final_d bmi_mom_prepreg_d, strata (coll_grad)

         failure _d:  meno_natural == 1
   analysis time _t:  meno_natural_agey
                 id:  id

Iteration 0:   log likelihood =  -995.2653
Iteration 1:   log likelihood = -992.57214
Iteration 2:   log likelihood = -992.56409
Iteration 3:   log likelihood = -992.56409
Refining estimates:
Iteration 0:   log likelihood = -992.56409

Stratified Cox regr. -- Breslow method for ties

No. of subjects =          594                  Number of obs    =         594
No. of failures =          207
Time at risk    =      29385.2
                                                LR chi2(7)       =        5.40
Log likelihood  =   -992.56409                  Prob > chi2      =      0.6110

----------------------------------------------------------------------------------------
                    _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
inf_all_pregs_ge35y_6m |
                  yes  |   1.169095    .169324     1.08   0.281     .8801718     1.55286
                       |
            race_3cats |
                black  |    1.08893   .2801552     0.33   0.741     .6576682    1.802991
                other  |   .7923159   .1704014    -1.08   0.279     .5197956    1.207714
                       |
                 gt70k |
                  yes  |   .8327365   .1352858    -1.13   0.260     .6056504    1.144968
                       |
      smokpreg_final_d |
           smoke preg  |   1.294729   .4279536     0.78   0.435     .6773709    2.474748
               xnever  |   1.124641   .1955035     0.68   0.499     .7999152    1.581188
                       |
     bmi_mom_prepreg_d |   .9853377   .0153303    -0.95   0.342     .9557443    1.015847
----------------------------------------------------------------------------------------
                                                       Stratified by coll_grad
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#2

06 Oct 2021, 13:49

For best formatting, you can copy the results from your Stata window, then paste them within the code delimiters. On the editing bar, tap the # button, then just paste the results within the brackets. I know that they're code delimiters, but they will also properly format the regression table. Unfortunately, it's generally not possible to take someone else's post, copy that, and paste things in the code delimiters and see clear results.

In general, though, you can say something like: Black race was associated with a hazard ratio of 1.089 compared to the reference group, p = 0.741 (I think?). A hazard ratio above 1 means that, at any given time, Black participants have a higher hazard of the event (here, natural menopause) than the reference group. 1.09 isn't a very large excess hazard, and the 95% confidence interval (0.66 to 1.80) includes 1.
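To make the link between the reported columns explicit, here is a minimal sketch that rebuilds the z statistic and confidence interval for the Black race row from the hazard ratio and its standard error. It assumes Stata's usual reporting for `stcox`, in which the standard error of the hazard ratio comes from the delta method and the interval is computed on the log scale; the scalar names are illustrative only.

```stata
* Values copied from the "black" row of the output table
scalar hr    = 1.08893
scalar se_hr = .2801552

* Move to the coefficient (log hazard ratio) scale
scalar b    = ln(scalar(hr))
scalar se_b = scalar(se_hr) / scalar(hr)   // delta method: se(HR) = HR * se(b)

* Wald z statistic and 95% confidence limits, exponentiated back to the HR scale
display "z      = " %6.2f scalar(b) / scalar(se_b)
display "95% CI = " %9.7f exp(scalar(b) - invnormal(.975) * scalar(se_b)) ///
        " to " %9.7f exp(scalar(b) + invnormal(.975) * scalar(se_b))
```

Run from a do-file, this should closely match the z of 0.33 and the interval of roughly .658 to 1.803 in the table, which is why an interval that spans 1 goes together with a large p-value.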

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.

Diana Soria

Join Date: Nov 2016
Posts: 9

07 Oct 2021, 17:12

Thanks, Weiwen.
These are my code and results.
My question is: how can I interpret the stratification by coll_grad? As you can see, I don't even get an HR for coll_grad. And how can I describe it in the methods (i.e., I have a model that includes predictors x, x, and x, and is stratified by coll_grad)?

Thanks,

Code:

stcox i.inf_all_pregs_ge35y_6m i.race_3cats i.gt70k i.smokpreg_final_d  bmi_mom_prepreg_d, strata (coll_grad) 

         failure _d:  meno_natural == 1
   analysis time _t:  meno_natural_agey
                 id:  id

Iteration 0:   log likelihood =  -995.2653
Iteration 1:   log likelihood = -992.57214
Iteration 2:   log likelihood = -992.56409
Iteration 3:   log likelihood = -992.56409
Refining estimates:
Iteration 0:   log likelihood = -992.56409

Stratified Cox regr. -- Breslow method for ties

No. of subjects =          594                  Number of obs    =         594
No. of failures =          207
Time at risk    =      29385.2
                                                LR chi2(7)       =        5.40
Log likelihood  =   -992.56409                  Prob > chi2      =      0.6110

----------------------------------------------------------------------------------------
                    _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
inf_all_pregs_ge35y_6m |
                  yes  |   1.169095    .169324     1.08   0.281     .8801718     1.55286
                       |
            race_3cats |
                black  |    1.08893   .2801552     0.33   0.741     .6576682    1.802991
                other  |   .7923159   .1704014    -1.08   0.279     .5197956    1.207714
                       |
                 gt70k |
                  yes  |   .8327365   .1352858    -1.13   0.260     .6056504    1.144968
                       |
      smokpreg_final_d |
           smoke preg  |   1.294729   .4279536     0.78   0.435     .6773709    2.474748
               xnever  |   1.124641   .1955035     0.68   0.499     .7999152    1.581188
                       |
     bmi_mom_prepreg_d |   .9853377   .0153303    -0.95   0.342     .9557443    1.015847
----------------------------------------------------------------------------------------
                                                       Stratified by coll_grad

. 
end of do-file

Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#4

07 Oct 2021, 17:42

I see the confusion now.

If the hazard curves for the two values of a binary predictor cross, this violates the proportional hazards assumption of the Cox model. Moreover, no single estimate of the hazard ratio will give an accurate picture. It’s more like the hazard ratio is above 1 up to a certain point and below 1 thereafter (that is, the log hazard ratio changes sign).

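Stratification by the covariate removes that as an issue, but the price you pay is that you don’t get to see the effect of the covariate in question. Each level of coll_grad gets its own baseline hazard, while the coefficients for the other predictors are shared across strata, so it’s a bit like fitting separate baseline hazards by college graduate status rather than estimating an HR for it. If the covariate is a substantively important one and you need to see the HR for it, you’d want to keep it in the model and let its effect vary with time (a time-varying coefficient, which stcox handles through its tvc() option).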
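As a rough sketch of that alternative, the model below keeps coll_grad as a predictor and lets its log hazard ratio change linearly with analysis time through the `tvc()` and `texp()` options of `stcox`. It assumes the data are already `stset` as in the output above and that coll_grad is coded 0/1; the linear-in-time form is an assumption for illustration, not something the thread establishes.

```stata
* Assumes the data are already stset (failure: meno_natural == 1, time: meno_natural_agey)
* and that coll_grad is a 0/1 indicator (an assumption about the coding).

* coll_grad enters as a main effect and again in tvc(), where it is
* multiplied by the function of time given in texp() (here, _t itself).
stcox i.inf_all_pregs_ge35y_6m i.race_3cats i.gt70k i.smokpreg_final_d ///
      bmi_mom_prepreg_d coll_grad, tvc(coll_grad) texp(_t)
```

With this specification the hazard ratio for coll_grad is no longer one number: at analysis time t it is the main-equation hazard ratio multiplied by the tvc-equation hazard ratio raised to the power t, so it is usually reported at a few chosen ages.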

In the methods, you can just say something like: the model was stratified by x, y, and z because the hazard functions for their levels crossed, so the proportional hazards assumption did not hold for them. You can use something like the sts graph command before fitting the model to look at this, or you could have used one of the post-estimation tests. I forget the names of those tests right now.
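For reference, the checks mentioned here could look something like the sketch below. It assumes the data are already `stset` as in the posted output; the graphical commands are `sts graph` and `stphplot`, and the post-estimation test based on Schoenfeld residuals is `estat phtest`. Fitting the unstratified model with coll_grad as a predictor is one illustrative way to test that variable, not necessarily what was done for the manuscript.

```stata
* Assumes the data are already stset as in the posted output.

* Kaplan-Meier survivor curves by college graduate status
sts graph, by(coll_grad)

* Smoothed hazard estimates by college graduate status (crossing curves are a warning sign)
sts graph, hazard by(coll_grad)

* Log-log plot: roughly parallel curves are consistent with proportional hazards
stphplot, by(coll_grad)

* Unstratified model with coll_grad as a predictor, then the
* Schoenfeld-residual test of proportional hazards, per covariate and globally
stcox i.inf_all_pregs_ge35y_6m i.race_3cats i.gt70k i.smokpreg_final_d ///
      bmi_mom_prepreg_d i.coll_grad
estat phtest, detail
```

A small p-value on the coll_grad row of `estat phtest, detail` would be the formal counterpart of the crossing curves, and either can be cited in the methods as the reason for stratifying.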
