  • Adjusting for baseline level in a mixed model

    Hi Stata experts.

    I have tried searching for solutions to my question on the forum, but I cannot find anything that applies to my situation (to my limited understanding). I have a dataset consisting of two groups with measurements at four timepoints. The outcome is continuous, and the (sample) data and model are the following:

    muscle time group force200hz
    1 2 1 955.71009
    1 3 1 959.32508
    1 4 1 965.15919
    2 2 1 950.00694
    2 3 1 953.19907
    2 4 1 937.6988
    3 2 1 1084.5172
    3 3 1 1081.1268
    3 4 1 1070.8881


    Code:
    mixed force200hz time#group || muscle:, nobase nocons residual(un, t(time)) reml dfmethod(kroger)

    Now, by chance the two groups have a markedly different baseline (timepoint 1) level of the outcome variable, and I'm concerned that this disturbs my analyses and interpretability of the results. Namely, this makes it hard to compare the groups directly at each timepoint (e.g. using pwcompare), or to simply compare the change from baseline (e.g. using lincom). In fact, differences in changes over time between the groups may be as small as 0.2 % and still get detected as significant. I suspect that this is because of the greater absolute decreases in the group with the greater baseline level. From other posts on this forum, I read that I might want to include the baseline level as a covariate, deleting timepoint 1 from the outcome variable, and creating a new column for the baseline level:

    muscle time group force200hz baseline200hz
    1 2 1 955.71009 992.21542
    1 3 1 959.32508 992.21542
    1 4 1 965.15919 992.21542
    2 2 1 950.00694 997.53312
    2 3 1 953.19907 997.53312
    2 4 1 937.6988 997.53312
    3 2 1 1084.5172 1111.1094
    3 3 1 1081.1268 1111.1094
    3 4 1 1070.8881 1111.1094


    Code:
    mixed force200hz time#group baseline200hz || muscle:, nobase nocons residual(un, t(time)) reml dfmethod(kroger)
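
    For concreteness, the follow-up comparisons I mean look roughly like the following after either model (illustrative only; the exact coefficient names depend on the parameterization and on how group is coded):

    Code:
    * groups compared directly at each timepoint (illustrative)
    pwcompare time#group, effects
    * between-group difference in the change from timepoint 2 to timepoint 4 (illustrative)
    lincom (_b[4.time#1.group] - _b[2.time#1.group]) - (_b[4.time#2.group] - _b[2.time#2.group])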

    This approach still finds some of the same clinically non-relevant differences in change over time of 0.2-0.4 % significant in a quite small dataset (n=6 in both groups), which I have a hard time accepting and explaining. For reference, I am usually looking at differences between 10-100 % in these types of analyses.

    Am I on a wild-goose chase, or is there another way to account for the differences at baseline? Should I simply accept these observed differences?

  • #2
    I’m not at my computer so I can’t work with your data, but it’s missing your second group anyway. For starters, the structure of your linear model is misspecified: only the interaction term is included in your models, and neither of the component “main” effects is included. Review -help fvvarlist- to see how to properly use factor notation and interactions. To cut to the chase, this should be

    Code:
    i.group##i.time
    Second, if you’re going to run an MMRM, why not put your baseline measure in the outcome vector? That is, include it as an outcome.

    The model will now fit, in the more familiar way, a saturated mean structure for each group-by-time combination, and your interaction terms will represent the between-group difference in the mean change from baseline.
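
    Something along these lines, using your variable names (a sketch only; whether to keep a random intercept or let the unstructured residual covariance do all the work, and which small-sample DF method to use, is up to you):

    Code:
    * baseline (timepoint 1) rows kept in the outcome vector
    mixed force200hz i.group##i.time || muscle:, noconstant ///
        residual(unstructured, t(time)) reml dfmethod(kroger)
    * joint test of the group differences in change from baseline
    contrast group#time, small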



    • #3
      Originally posted by Jon Herskind
      . . . differences in changes over time between the groups may be as small as 0.2 % and still get detected as significant. I suspect that this is because of the greater absolute decreases in the group with the greater baseline level.

      [ANCOVA] still finds some of the same clinically non-relevant differences in change over time of 0.2-0.4 % significant in a quite small dataset (n=6 in both groups), which I have a hard time accepting and explaining. For reference, I am usually looking at differences between 10-100 % in these types of analyses.
      From the tiny snippet of data that you do post, you have an astonishing intraclass correlation coefficient of 0.99, but the culprit is more likely model misspecification, as Leonardo suggests: taking the observed between-subjects and within-subjects variances as given, even a 10% difference between groups in the changes over time results in less than an 8% null hypothesis rejection rate with that sample size, not much more than the nominal Type I error rate. Code highlights below; complete do-file and log file attached.
      Code:
      version 18.0
      
      clear *
      
      quietly input byte(muscle time group) double(force200hz baseline200hz)
      <redacted for brevity>
      end
      
      rename muscle mid
      rename time tim
      rename group grp
      rename force200hz out
      
      quietly reshape wide out, i(mid) j(tim)
      rename baseline200hz out1
      // Small degree of autocorrelation, but compound symmetry assumption probably OK
      correlate out1 out2-out4, covariance
      quietly reshape long out, i(mid) j(tim)
      
      xtreg out i.tim, i(mid) fe
      
      *
      * Begin here
      *
      // seedem
      set seed 1718939513
      
      tempname B
      matrix define `B' = e(b)
      
      tempname sigma_e sigma_u
      scalar define `sigma_e' = e(sigma_e)
      scalar define `sigma_u' = e(sigma)
      
      program define simEm, rclass
          version 18.0
          syntax , b(name) sigma_e(name) sigma_u(name)
      
          drop _all
          quietly set obs 12
          generate byte mid = _n
          generate byte grp = mod(_n, 2)
          generate double out1 = rnormal(`b'[1, 5], `sigma_u')
          forvalues t = 2/4 {
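              // grp == 1 gets changes from baseline 10% larger than grp == 0 (the "10% difference" discussed above)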
              generate double out`t' = out1 + `b'[1, `t'] * cond(grp, 1.1, 1)
          }
          quietly reshape long out, i(mid) j(tim)
          quietly replace out = rnormal(out, `sigma_e')
          mixed out i.grp##i.tim || mid: , reml dfmethod(satterthwaite) nolrtest nolog
          contrast grp#tim, small
          return scalar p = r(p)[1, 1]
      end
      
      quietly simulate p = r(p), double reps(400): simEm , b(`B') ///
          sigma_e(`sigma_e') sigma_u(`sigma_u')
      assert !mi(p)
      generate byte pos = p < 0.05
      tabulate pos
      
      exit
      Attached Files



      • #4
        Thank you both for your replies. My apologies for an oversight: the number of muscles is n=6 in group 1 and n=4 in group 2. I have attached the datafile here, both with the baseline measurement as part of the outcome variable and as a separate variable. I also previously fit the model with the interaction term specified as time##group, with similar results in my lincom commands.

        Second, if you’re going to run an MMRM, why not put your baseline measure in the outcome vector? That is, include it as an outcome.
        As I wrote earlier, I found an earlier post (and this one) with a somewhat similar issue, where it was recommended not to include the baseline value as part of the outcome variable but instead to include it as a separate covariate.


        Joseph: Thank you for the elaborate response. I am not familiar with this type of simulation, and there are a few lines of code I don't fully understand. As I understand it, you run the mixed model:
        Code:
        mixed out i.grp##i.tim || mid: , reml dfmethod(satterthwaite) nolrtest nolog
        which is similar to what Leonardo suggested, and then contrast the groups at each timepoint. As can be seen from the attached dataset, the force at timepoint 1 (baseline) is quite imbalanced, with ~944 units in one group and ~743 in the other. As I am interested not so much in the contrast at individual timepoints as in the development over time, I have previously used lincom, which then runs into problems because it is testing the absolute differences. I was hoping I could make the model consider the large difference in force at baseline and assess whether the differences over time are "real" or simply a product of a higher baseline level, and whether the baseline level can be seen as a confounder or something else. I hope that makes sense, and I hope I didn't misunderstand your replies. Thanks again.
        Attached Files



        • #5
          Originally posted by Jon Herskind
          I was hoping I could make the model consider the large difference in force at baseline and assess whether the differences over time are "real" or simply a product of a higher baseline level, and whether the baseline level can be seen as a confounder or something else.
          I think that you're going to need to look to physiology and not statistics for answers to those questions.

          Adjustment via ANCOVA or constrained longitudinal data analysis (cLDA) is, I believe, recommended only for randomized controlled trials, where the expectation that baseline values on the outcome measure will be the same is warranted.

          From the data in the worksheet that you attached, it's clear that Group 1 starts at a higher level of force (millinewtons?) and drops to a greater extent over successive observation intervals. You can make that difference between groups in the timecourse go away to some extent by computing the change scores as percentages of the baseline value before fitting a regression model to them, but whether that elucidates or obfuscates what's really going on isn't, I suspect, answerable in terms of whether the resulting p-values now fail to reach some threshold of statistical significance. In the case of your data, even going that route (i.e., fitting a linear mixed model to percent-of-baseline change scores), there remain clear differences between the two groups in the marginsplot.
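
          By way of illustration, the percent-of-baseline route would look something like this, using the variable names from your attached worksheet and assuming the layout with baseline200hz as a separate column (a sketch, not a recommendation):

          Code:
          * change scores as a percentage of the baseline value
          generate double pctchg = 100 * (force200hz - baseline200hz) / baseline200hz
          mixed pctchg i.group##i.time || muscle: , reml
          margins group#time
          marginsplot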

          My suggestion is to go ahead and fit the most straightforward model to your limited data, which is the one that Leonardo mentions and the one that I use in my simulations. It is suitable regardless of whether your data are from a randomized controlled trial or from a pilot study with group membership not under the experimenter's control. You can use the marginsplot descriptively as a summary of the study's results; interpretation of the graph will have to defer to subject-matter knowledge.
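
          That is, something along these lines (a sketch; substitute your preferred small-sample degrees-of-freedom method):

          Code:
          mixed force200hz i.group##i.time || muscle: , reml dfmethod(satterthwaite)
          contrast group#time, small
          * descriptive summary of the study's results
          margins group#time
          marginsplot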

          (". . .a 10% difference between groups in the changes over time results in . . .")

