
  • How to interpret Cox outputs with time-varying coefficients?

    Hello,
    I am studying 5-year mortality and running an -stcox- model with the tvc() option. However, I am not sure how to interpret this model's output in words, nor how to compute appropriate confidence intervals.

    My model output:

    Code:
    -----------------------------------------------------------------------------------------
                         _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------------+----------------------------------------------------------------
    main                    |
                   subgroup |
                         1  |   .3866378   .0437078    -8.41   0.000     .3097986    .4825353
                         2  |   .5724428   .0713563    -4.48   0.000     .4483611    .7308636
    
    
    tvc                     |
                   subgroup |
                         1  |    1.15097   .0633358     2.56   0.011     1.033294    1.282047
                         2  |   1.260569   .0780036     3.74   0.000     1.116592    1.423111
    -----------------------------------------------------------------------------------------
    Note: Variables in tvc equation interacted with _t.
    Would it be correct to say, relative to the reference group:
    Group 1 has an HR for mortality of 0.39 [95% CI 0.31-0.48] at time zero, but an HR of (0.39) * (1.15)^5 = 0.72 at time 5?
    Group 2 has an HR for mortality of 0.57 [95% CI 0.45-0.73] at time zero, but an HR of (0.57) * (1.26)^5 = 1.70 at time 5?

    I think I am confused because of the direction change I see in the HR for Group 2, where it confers a protective effect at time 0, but then is linked with increased risk of mortality at time 5.
    Meanwhile, for the confidence intervals, should I compute them using the same method?

    Thank you so much.

  • #2
    Yes, for the time-specific hazard ratios, what you have is correct. It works just like any other interaction effect. But the same approach does not carry over to the confidence intervals. In any case, rather than doing this all by hand, it is easier and less error-prone to use -lincom- for this.

    Code:
    lincom _b[main:1.subgroup] + 5*_b[tvc:1.subgroup], hr
    lincom _b[main:2.subgroup] + 5*_b[tvc:2.subgroup], hr
    Note: As per the Forum FAQ, as you did not say what version of Stata you are running, I am assuming, and writing code for, the current version, 18. Sometimes Stata changes the way it names regression coefficients in some estimation procedures when versions change. So if you are using an older version, the contents of _b[] that I have specified here may not be recognized and Stata will give you an error message. If that happens, to find out what the correct way to refer to the various coefficients is, run -stcox, coeflegend- and Stata will replay the results table, but it will show the coefficient names rather than standard errors and other inferential statistics in the output.
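    For intuition about what -lincom- computes, the arithmetic can be sketched outside Stata. In this Python sketch, the two log-scale coefficients are recovered from the posted table, but the variances and the covariance are made-up values for illustration only (the posted output does not report the covariance, which Stata takes from e(V)), so the resulting interval will not match Stata's:

    ```python
    import math

    # Log-scale coefficients recovered from the posted hazard ratios for subgroup 1.
    b_main = math.log(0.3866378)   # main equation coefficient
    b_tvc  = math.log(1.15097)     # tvc equation coefficient (per unit of _t)

    # The variance/covariance values below are HYPOTHETICAL, for illustration only;
    # in Stata they come from e(V) and are what -lincom- actually uses.
    var_main, var_tvc, cov = 0.0128, 0.0030, -0.0055

    t = 5
    # Point estimate: combine on the log scale, then exponentiate.
    log_hr = b_main + t * b_tvc
    hr = math.exp(log_hr)          # equals 0.3866378 * 1.15097**5

    # Delta-method standard error of the linear combination b_main + t*b_tvc:
    se = math.sqrt(var_main + t**2 * var_tvc + 2 * t * cov)

    # 95% CI built on the log scale and then exponentiated -- this is why you
    # cannot simply combine the two reported confidence intervals by hand.
    z = 1.959964
    ci_lo, ci_hi = math.exp(log_hr - z * se), math.exp(log_hr + z * se)
    ```

    The key point is that the interval depends on the variance of the linear combination, which requires the covariance between the two coefficients; that covariance is in e(V) but not in the printed results table.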

    I think I am confused because of the direction change I see in the HR for Group 2, where it confers a protective effect at time 0, but then is linked with increased risk of mortality at time 5.
    There is nothing unusual or surprising about this. As with any other interaction between a discrete and continuous variable, the direction of the effect of the discrete variable can change with a sufficiently large change in the continuous variable (_t in this case).
    Last edited by Clyde Schechter; 12 Oct 2023, 20:26.



    • #3
      Hi Dr. Schechter- thank you so much for taking the time to comment and help me, I am so appreciative.

      I'm also sorry for not specifying earlier: I'm using Stata 16.1.

      I ran the code you were kind enough to write, and it returned these outputs:
      Code:
      . lincom _b[main:1.subgroup] + 5*_b[tvc:1.subgroup], hr
      
       ( 1)  [main]1.subgroup + 5*[tvc]1.subgroup = 0
      
      ------------------------------------------------------------------------------
                _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               (1) |   .7369259   .1404727    -1.60   0.109     .5071868     1.07073
      ------------------------------------------------------------------------------
      
      . 
      . lincom _b[main:2.subgroup] + 5*_b[tvc:2.subgroup], hr
      
       ( 1)  [main]2.subgroup + 5*[tvc]2.subgroup = 0
      
      ------------------------------------------------------------------------------
                _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               (1) |   1.695164   .3691953     2.42   0.015      1.10618    2.597752
      ------------------------------------------------------------------------------
      I am struggling with what to do with this. Per these confidence intervals, the HR at time 5 for subgroup 1 crosses 1, so I interpret that as equivalent survival to the reference group. For group 2, at time 5 there appears to be greater mortality compared to the reference.

      This does not conform with my unadjusted Kaplan-Meier curves; I understand this may be due to adjustment. However, before I figured out the tvc() option for -stcox- as a workaround for the proportional hazards violation, I applied multivariable accelerated failure time models using a lognormal distribution.

      I've structured my model as follows:
      Code:
      streg i.subgroup [model covariates added here], distribution(lognormal) time tratio allbaselevels cformat(%9.3f) pformat(%5.3f) sformat(%8.3f)
      When I consider this model, I see the following output:

      Code:
      -----------------------------------------------------------------------------------------
                           _t | Time Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
      ------------------------+----------------------------------------------------------------
                     subgroup |
                           0  |      1.000  (base)
                           1  |      2.136      0.132   12.266   0.000        1.892       2.412
                           2  |      1.478      0.117    4.919   0.000        1.265       1.727
      I have interpreted this to mean that both subgroups 1 and 2 confer a survival advantage relative to subgroup 0 (reference). Yet how does this agree with the confidence intervals from my -stcox-, tvc() model?

      Thank you so much again.



      • #4
        Per these confidence intervals, the HR at time 5 for subgroup 1 crosses 1, so I interpret that as equivalent survival to the reference group.
        That is the conventional, standard misinterpretation of non-statistically significant results that, undoubtedly, you were taught in your statistics courses. The correct interpretation is that relative to the reference group, the data are consistent with the hazard being anywhere from substantially lower to slightly higher in subgroup 1. It is definitely not possible to assert that the hazards in the two groups are the same. Their being the same is merely one possible outcome, one with zero probability.

        Once you change a model in any way, the results can change in any way. If that weren't the case, there would be no point to adjusting models in the first place. The -streg- results are interesting but have no bearing on what to make of the -stcox- results. Not only is the -streg- model adjusted, but it is a model with a specified underlying parametric hazard function. There is no reason to expect them to agree. Given that the -stcox- model is less constrained than the -streg- model, I would usually lean towards giving -stcox- more credence. On the other hand, if the covariates were appropriately chosen, I lean towards adjusted models over crude models in observational studies. So I wouldn't even try to form an opinion about which model is more credible here.



        • #5
          Hi Dr. Schechter,
          Thank you so much, all of this is incredibly helpful. To confirm: if I am constructing multivariable -stcox- and -streg- models, which one would be more robust for an observational study? I was under the impression that parametric accelerated failure time models offer a better fit when the proportional hazards assumption is violated, relative to a standard Cox model, but I wasn't sure how the tvc() option changes that.

          I had one more question related to this. When I do a landmark analysis and consider only patients who survived the first year, there is no longer a violation of the proportional hazards assumption. For this subgroup, is it appropriate to run a standard -stcox-?

          Thank you so much!



          • #6
            Well, because AFT models do not estimate a hazard ratio, their applicability does not rely on a proportional hazards assumption. But nothing comes for free in statistics. The concept underlying an AFT model is that variables exert their effects by, in a sense, making time pass at a faster or slower rate. And if you do not include an interaction of those effects with time, then the underlying assumption is that this ratio of time passage rates is constant. So this assumption is very much analogous to the proportional hazards assumption. It is a constant time rate ratio assumption. So you don't really escape from proportional hazards with these models: you just trade that assumption for a different assumption.
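            The "time passes faster or slower" idea can be made concrete with a small numerical check. In this Python sketch, mu0 and sigma are made-up lognormal parameters (only the 2.136 time ratio echoes the earlier -streg- output): under the constant time-ratio assumption, the exposed group's survival curve is exactly the baseline curve evaluated at t/TR, at every time point.

            ```python
            import math

            def lognormal_surv(t, mu, sigma):
                # S(t) = P(T > t) when log(T) ~ Normal(mu, sigma)
                z = (math.log(t) - mu) / sigma
                return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

            mu0, sigma = 1.0, 0.8        # hypothetical baseline parameters
            TR = 2.136                   # time ratio, echoing the posted streg output
            mu1 = mu0 + math.log(TR)     # AFT effect: a location shift on the log-time scale

            # Constant time-ratio assumption: S1(t) = S0(t / TR) for every t.
            for t in (1.0, 3.0, 5.0):
                assert abs(lognormal_surv(t, mu1, sigma) - lognormal_surv(t / TR, mu0, sigma)) < 1e-12
            ```

            If that ratio itself changed with time, the constant time-ratio assumption would be violated, just as a changing hazard ratio violates proportional hazards.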

            I haven't used any AFT models in my work, at least not in the last couple of decades, so I don't know how one tests this assumption--testing the interaction between a model parameter and time would be one generic way to do that. But anyway, the notion that using a parametric AFT model is a "get out of jail free" card vis a vis proportional hazards is simply not true.

            Violation of the proportional hazards assumption is just one form of model misspecification in a survival analysis. Using -tvc()- with a Cox model deals with the proportional hazards violation to the same extent that including interaction terms in any other kind of regression model deals with model misspecification: it resolves the problem to the extent that the -tvc()- specification is itself correct. It may or may not be, as with any interaction. Just as adding an interaction with a continuous variable to a regular regression might be inadequate because you need to also interact with, say, the square of the continuous variable, it might be that in a survival analysis you need to interact with the log of _t rather than _t itself, or with some other transform. (Generally, you wouldn't expect an interaction with _t^2, but I suppose anything is possible.)

            As for your second question, if the analysis restricted to those who survive the first year, carried out without -tvc()- exhibits no proportional hazards violation, then, yes, you can use the plain -stcox- analysis for this.



            • #7
              Originally posted by Clyde Schechter View Post
              Just as adding an interaction with a continuous variable to a regular regression might be inadequate because you need to also interact with, say, the square of the continuous variable, it might be that in a survival analysis you need to interact with the log of _t rather than _t itself, or with some other transform. (Generally, you wouldn't expect an interaction with _t^2, but I suppose anything is possible.)
              Thank you so much. I am no longer confident that an interaction with _t alone is sufficient, so I now have two questions. First, is there a specific test to evaluate this? And second, if I hypothetically wanted to interact with the log of _t, do I need to log-transform my entire -stset- and then proceed with tvc()?

              Thank you again!



              • #8
                First, is there a specific test to evaluate this?
                You can use -estat phtest-. But, -estat phtest- cannot be used after -stcox, tvc()-. So if you want to go down this route, you have to -stsplit- the data set at failures, and then you need to redo the -stcox- using a hand-crafted interaction term (interaction of PH-violating variable(s) with _t), not the -tvc()- option.

                And second, if I hypothetically wanted to interact with the log of _t, do I need to log-transform my entire -stset- and then proceed with tvc()?
                No, it's much simpler than that. You don't have to meddle with -stset- at all. In addition to specifying -tvc(varlist)- also specify the option -texp(log(_t))-. You can put any expression you like into -texp()-. Of course, if you are going to test PH again after this, then you can't use -tvc()-. But it's the same process overall. -stsplit- the data at failures, and then in your -stcox- add an interaction term -phv#c.lnt- where phv is your proportional hazards violating variable, and you have previously created variable lnt with -gen lnt = log(_t)-. (If more than one variable violates PH, include an interaction term for each.)
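                To see what -texp(log(_t))- does to the shape of the effect: with the default -texp(_t)-, the hazard ratio changes by a constant multiplicative factor per unit of time, whereas with log(_t) it varies as a power of t. A minimal Python sketch with made-up coefficients:

                ```python
                import math

                b_main = math.log(0.39)   # hypothetical log-HR at t = 1
                b_tvc  = 0.25             # hypothetical coefficient on the interaction term

                def hr_default(t):
                    # texp(_t): log-HR = b_main + b_tvc*t, so HR(t) = HR_main * exp(b_tvc)**t
                    return math.exp(b_main + b_tvc * t)

                def hr_logtime(t):
                    # texp(log(_t)): log-HR = b_main + b_tvc*log(t), so HR(t) = HR_main * t**b_tvc
                    return math.exp(b_main + b_tvc * math.log(t))

                # Power-law check: doubling t multiplies the log-time HR by 2**b_tvc ...
                assert abs(hr_logtime(4.0) / hr_logtime(2.0) - 2.0 ** b_tvc) < 1e-12
                # ... whereas under the default, each unit of t multiplies the HR by exp(b_tvc).
                assert abs(hr_default(3.0) / hr_default(2.0) - math.exp(b_tvc)) < 1e-12
                ```

                Which shape is more plausible is an empirical question; the goodness-of-fit check described below is one way to judge it.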

                Added: I am not a fan of statistical significance tests in most contexts, and I particularly dislike them for purposes of selecting models. In this situation, I would not be inclined to rely on a PH test and more likely to rely on -estat gof- which provides a graphical goodness of fit plot. The advantage of doing this visually is that you get a sense of just how large the deviations from fit are and can judge whether they are large enough to matter for practical purposes. This is especially important if your sample size is large because in that context goodness of fit tests tend to reject models for trivial deviations, even deviations so small that they are nearly or actually visually imperceptible with goodness of fit graphs.
                Last edited by Clyde Schechter; 16 Oct 2023, 19:07.
