Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Multilevel mixed effects model: estimating mean rate of change for each exposure group

    Hello,

    I am working on a study where a group of individuals have a CT scan for a specific measurement (let's call it ct) at baseline (t0), then at 2 more follow-up (t1 and t2). The interval for follow-up was not uniform across individuals, and the time between t0 and subsequent followups is represented by variable time (a continuous variable). These individuals are assigned into 3 groups based on their comorbidities, age, etc, which is represented by variable group (values 1, 2, 3). Each individual is assigned a number (id). There are covariates but let's simplify things by just putting one for now (covar).

    I am interested in comparing the change in ct over time between groups, which, as I understand, can be done with the following code to allow for random effects at the individual level (there is no reason to suspect random slopes I think):

    Code:
    mixed ct i.group##c.time covar || id:, covariance(unstructured)
    from which the coefficient, 95% CI and p-value for the i.group#c.time will be what I am looking for primarily, i.e. how groups 2 and 3 compare to group 1 in terms of the rate ofchange in ct over time, respectively.

    However, I also want to report the mean estimated rate of change in ct (per unit time) for each group, i.e. not only the difference between groups, but also the absolute value of the mean rate of change for each group. I suspect this can be done using something under the 'predict' postestimation command, but what should the code be? I'm admittedly not familiar with postestimation commands here so would appreciate expert help around here. (Of course, if any of the above is incorrect, please do kindly correct me too!)

    Thank you very much in advance.

  • #2
    Code:
    margins group, dydx(time)
    For the absolute values, just ignore the sign.

    A couple of asides:
    there is no reason to suspect random slopes I think
    Seriously? You don't think that individuals in the same group could differ in the rate of change over time of the ct outcome? You believe your group classification completely accounts for 100% of the differences between people in this rate of change? Color me deeply skeptical. If you want to say that you think your group classification is so detailed and comprehensive that the remaining inter-individual variation is too small to bother with, well, fine. But if you try to publish this or present it to an audience, you will have to build a case for that.


    Comment


    • #3
      If you assume this model, it means that you are allowing time to vary linearly, but that within each group, those members have a common group trajectory. You won't need the covariance() option since you do not specify any random slopes, only the random intercept for participants.

      The estimated mean difference between groups needs specification at a specific time.
      The estimated rate of change for each group is given by the linear combination of time and time-by-group interaction.

      The following code is a sketch, but is not tested as no example data was provided. Typos or errors may exist.

      Code:
      lincom _b[time] + _b[0.group#c.time]  * rates of change for each group
      lincom _b[time] + _b[1.group#c.time]
      lincom _b[time] + _b[2.group#c.time]
      margins i.group, at(time==1)   * estimated mean of groups at fixed time
      margins , dydx(group) at(time==1)   * estimated mean difference between groups at fixed time

      Comment


      • #4
        I agree with Clyde on this. Additionally, there is compelling methodological research to show that cross-level interactions in which a time varying variable (time in your case) is interacted with a time-invariant variable (group) can report overly optimistic standard errors unless the time varying variable is allowed to vary across indivduals/clusters. This issue was also blogged recently by Gelman. You can always use model comparison (lrtest in Stata) to provide further evidence for/against the inclusion of the random slope. You are going to find it difficult to sell the model you are proposing unless you can prove to reviewers that the random slope model does not fit the data as well.

        Comment


        • #5
          Thanks all. I appreciate your point on random slope and will look to incorporate it. In this case, how should it be specified? Given that the independent variables can all have different effects on each person, would it make sense to allow random slope for all of them? From my understanding that would require the following code (I'm not sure if the random effects part accept '##' so I have expanded the terms):
          Code:
          mixed ct i.group##c.time covar || id: i.group i.group#c.time time covar, covariance(unstructured)
          Additionally, I am curious about the different answers Clyde Schechter and Leonardo Guizzetti gave for the rates of change for each group. I wonder if
          Code:
          margins group, dydx(time)
          would give the same results as the 'lincom'-based codes?

          Thanks a lot. I really appreciate your kind help.

          Comment


          • #6
            Yes, the -lincom- approach and the -margins group, dydx(time)- will give the same results.

            As for the random slopes, including time is pretty much obligatory, and including covar probably makes sense, at least if the number of covariates is small enough for this to be manageable. But adding i.group doesn't make a lot of sense here, because the random slope on i.group will be entangled with the random intercept at the individual level and will probably make the model unidentifiable and non-convergent. I think similar reasoning applies to the i.group#c.time term.

            Comment


            • #7
              Thanks so much Clyde for reverting so quickly. This was very helpful!

              Comment


              • #8
                Clyde Schechter Leonardo Guizzetti I hope you don't mind me asking a follow-up question...

                Suppose the -group- variable has 4 levels instead of 3, so values 1-4 (integers). Now, there is an additional continuous explanatory variable of interest, -marker-. The intentions here are to examine whether the groups differ in the relationship between -marker- and the rate of change in -ct-, and whether the group 1 vs group 2 difference in relationship between -marker- and the rate of change in -ct- differs between the same between group 3 and group 4.

                In other words:
                Objective 1: whether marker-ct relationship for group 1 is the same for group 2 / 3 / 4
                Objective 2: whether (marker-ct relationship for group 1) - (marker-ct relationship for group 2) = (marker-ct relationship for group 3) - (marker-ct relationship for group 4)

                I have thought about this for quite a bit now and I cannot seem to eliminate the need for a 3-way interaction term, which I have absolutely no experience with, frankly. The code I sort of conjured up is as follows:
                Code:
                mixed ct c.marker##group##c.time covar || id: time covar marker, vce(unstructured)
                However, I really have not much idea how I can proceed from this point and test the hypothesis above. I thought about using margins to estimate the "marker-ct relationship" for each group and then using -test- to test the above hypotheses, but I'm 1) not sure if this is a valid idea and, 2) not sure how to use margins to estimate the said relationship in the first place.

                If it is helpful, -group- is actually 4 possible combinations from the interaction between binary variables -age- and -sex-, so basically younger females, older females, younger males, and older males. The groups were not matched / paired. On this note, I'm not completely sure if -group- should be specified as a higher level with random intercept / slope either, despite the biological plausibility.

                I would be really grateful for your expert advice on this matter. I hope I have made myself sufficiently clear above, and thank you so much in advance!!

                Comment


                • #9
                  Yes, this is a 3-way interaction problem. And 3-way interactions can get pretty confusing. But the -margins- command makes it simpler.
                  Code:
                  margins group, dydx(marker) post
                  // Objective 1
                  testparm i.(2/4)group
                  
                  // Objective 2
                  lincom (4.group - 3.group) - (2.group - 1.group)
                  Note: You may get a -(not estimable)- response from the -margins- command. If all values of group are represented in the estimation sample and there are adequate numbers of observations with non-missing values of time and marker, there is no reason why these marginal effects should be unidentified that I can see. So I would go ahead and add the -noestimcheck- option to the -margins- command if you encounter this difficulty.

                  Comment


                  • #10
                    Thank you very much Clyde!!!

                    Comment


                    • #11
                      Clyde Schechter please forgive me for asking yet another follow-up question.

                      I want to return to the first scenario above that I had put forth:
                      I am working on a study where a group of individuals have a CT scan for a specific measurement (let's call it ct) at baseline (t0), then at 2 more follow-up (t1 and t2). The interval for follow-up was not uniform across individuals, and the time between t0 and subsequent followups is represented by variable time (a continuous variable). These individuals are assigned into 3 groups based on their comorbidities, age, etc, which is represented by variable group (values 1, 2, 3). Each individual is assigned a number (id). There are covariates but let's simplify things by just putting one for now (covar). I am interested in comparing the change in ct over time between groups
                      Now, further discussions within our team has come to the conclusion that it is of far greater clinical relevance to analyze ct as either a binary variable (0/1) or an ordinal variable. Correspondingly, we will be changing the model used to mixed effects binary or ordinal logistic regression, respectively. In other words, we will be using the following commands:
                      Code:
                      melogit ct i.group##c.time covar || id: i.group i.group#c.time time covar, covariance(unstructured) or
                      Code:
                      meologit ct i.group##c.time covar || id: i.group i.group#c.time time covar, covariance(unstructured) or
                      Now, we would like to quantify the probability of having ct=1 (when analysed as a binary variable) or a higher ct category (when analysed as an ordinal variable) per unit time. May I ask if the following command would achieve this purpose? If not, I would be grateful if you could advise me on how that can be achieved.
                      Code:
                      margins i.group, dydx(time) predict(mu)
                      And then, with time being in years, we would also like to quantify similar probabilities (as above) at 1, 3, and 5 years. May I ask if the following code would be appropriate for this purpose? If not, again your advice would be most appreciated.
                      Code:
                      margins i.group, at(time=1 3 5) predict(mu)
                      Thank you very much in advance and I apologize if the questions above were too basic / could have been figured out on my own -- I tried and am not sure, so I hope you don't mind these questions!!

                      Comment


                      • #12
                        I will refrain from commenting on whether it is wise to convert ct to an ordinal or dichotomous outcome as it is a complicated issue that depends on a number of factual matters that I am ignorant of.

                        The code examples you show in #11 will, if run in the current version of Stata, produce the outputs you are looking for. If you are using some earlier versions of Stata, things might be different, at least for -margins- following -meologit-. This is because at some point back in time -margins- would produce predicted probabilities for only one outcome level at a time, whereas currently it does it for all levels. Version 17 behaves the same way as the current version 18. I don't remember where version 16 stands on this, and I don't remember at which version the change was made. So if you are using version 16 or earlier, run -help meologit_postestimation##margins- to find out.

                        Comment


                        • #13
                          Thank you very much Clyde!

                          Comment


                          • #14
                            Jeffrey Chan I am confused by something you said earlier about the group variable:
                            These individuals are assigned into 3 groups based on their comorbidities, age, etc, which is represented by variable group (values 1, 2, 3).
                            If that is the case, then you cannot include group in the random part of the model since it is a between person (time-invariant) variable. Only time-varying (within person) variables can be specified as random slopes. I would be surprised if that model ran and if it did, I would make sure that the group variable is coded as you expect. The following code is what makes the most sense based on how you have described your data and your research question:
                            Code:
                            melogit ct i.group##c.time covar || id: time , covariance(unstructured) or
                            There may be some justification for allowing the covariates to have random slopes, but if they are not of central interest, then I would not complicate the model by having them as random slopes. Doing so will also increase computational complexity and may lead to estimation problems.

                            Comment


                            • #15
                              Group is time invariant, and cannot by itself have a random slope. But group#c.time is time-varying and can have a random slope. So, yes, you will have to remove i.group from the random slopes, but you can leave i.group#c.time there.

                              Comment

                              Working...
                              X