Mis-specified baseline interaction term in mixed linear model?

Alan Niemann

Join Date: Jun 2018

Posts: 5
#1


01 Jul 2018, 15:27

Dear Statalist,

I am working on a mixed linear model with a continuous dependent variable (d100, range 0-100) representing a clinical score measured repeatedly on 230 patients at differing intervals (i.e. the panel is unbalanced). I am primarily interested in how the change in d100 over time differs according to a categorical variable (cod, 1-3) that I expect to have a major effect, and have fit the following model:

Code:

mixed d100 c.days##i.cod if dcode==1 & baseline>0 & cod!=0 || ID:days, covariance(unstructured) residuals(independent) stddeviations

This gives the following output:

I wanted to go further and establish whether the baseline d100 value at day 0 affects the subsequent slope of decline over days. My issue is that when I include an interaction term for this, the change is much greater than I would have expected: the effect of cod on the days slope is no longer significant. This is a great surprise, and to me it suggests that once baseline is controlled for, cod has no effect on the slope of decline in d100 over days.

Code:

mixed d100 days i.cod##c.days##c.baseline if dcode==1 & baseline>0 & cod!=0 || ID:days, covariance(unstructured) residuals(independent) stddeviations

Am I misinterpreting or mis-specifying this model, or are my findings accurate?
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#2

01 Jul 2018, 18:38

You cannot directly compare what appear to be corresponding coefficients in these two models. i.cod#c.days (i = 1 or 2) means something different in the first model from what it means in the second model and there is no reason to expect them to be equal, nor nearly equal, nor even of the same sign.

The interpretation of three-way interaction models is difficult even for experienced analysts, and it is nearly impossible to do from the regression output itself. To see if the model has changed appreciably, you really need to use the -margins- command, followed by -marginsplot-, to graph predicted values of d100 at a representative range of combinations of baseline and days and all values of cod. When you do that, I suspect you will find that the changes following the introduction of baseline are far less radical than the coefficients (mis)lead you to believe. Do that. If you are seeing radical changes in the graphs, then post back with those results. (There is also a minor issue that, probably due to some missing values of baseline, the sample size is a bit smaller in the second model than in the first--but the number of missing values is apparently small and unlikely to make an important impact.)
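
To make the -margins- suggestion concrete, here is a minimal sketch, to be run immediately after fitting the three-way model from #1. The grids of days and baseline values are illustrative assumptions, not values taken from this thread; substitute values that span your own data. It also helps to keep in mind that, with baseline entered as written in #1, the cod#c.days coefficients in the three-way model are slope differences between cod levels at baseline = 0, a value the condition baseline>0 excludes from the estimation sample, which is one reason they need not resemble the coefficients from the first model.

Code:

```stata
* Sketch only: the days grid (0 to 400 in steps of 100) and the baseline
* values (20, 50, 80) are assumed for illustration.
* Predicted d100 (fixed portion of the model) for every cod level at each
* combination of days and baseline.
margins cod, at(days=(0(100)400) baseline=(20 50 80))

* One panel per cod level, days on the horizontal axis,
* one line per baseline value.
marginsplot, xdimension(days) by(cod)
```

Lines that are roughly parallel within a panel would indicate that baseline shifts the level of d100 without much changing the slope over days.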
Alan Niemann

Join Date: Jun 2018

Posts: 5
#3

02 Jul 2018, 14:45

Dear Clyde,

Thanks very much for your helpful and swift reply! I have plotted the margins from the second model according to several baseline values and all cod values, which on one graph is difficult to interpret:

When I plot each discrete value of cod against all the baseline values, the mean slope per cod value is unchanged according to baseline. Does this imply that the baseline does not have an effect?

However, I'm not sure what I am supposed to compare this to from the model without the baseline interaction. I can't plot margins according to baseline in that model, since baseline is not in it. I can compare the mean slope per cod for each model, as below, and there doesn't seem to be a huge change (the second graph is with the baseline interaction):

Finally, is there a way to interpret the effect of the baseline interaction on the cod/days effect in terms of a p value? I can derive comparisons of discrete levels of cod on the margins using r.cod, but I'm not sure how to do this with a continuous variable like baseline.

I appreciate there are a number of questions here, but would really appreciate your help or pointing me towards the best way to analyse this/resources.

Many thanks
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#4

02 Jul 2018, 15:45

What I had in mind was to start with the second and third graphs you show, which suggest that introducing baseline has not made much difference. I agree that the first graph is effectively unreadable. You could improve its readability by thinning it out, re-doing it with fewer values of cent_baseline, say just 15, 45, and 90 to get a look at the impact of having baseline in the model when its values are extreme, or nearly dead-center. If what you see is that the lines with 15, 45, and 90 are more or less parallel to each other, but separated vertically, then that would suggest that a model introducing baseline, but not interacting it, would be appropriate.
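
The parallel-lines check can also be done numerically by asking -margins- for the days slope itself. The following is a sketch under stated assumptions: it is run after the three-way model from #1, and it uses the variable name baseline as in that model (if the fitted model actually contains cent_baseline, substitute that name). The last command is one possible form of the simpler model with baseline entered but not interacted; it is an illustration, not a recommendation to adopt it.

Code:

```stata
* Slope of d100 on days for each cod level, evaluated at three baseline
* values (15, 45 and 90, the values suggested above).
margins cod, dydx(days) at(baseline=(15 45 90))

* Plot the slopes against baseline, one line per cod level.
* Near-flat lines mean the slope hardly depends on baseline.
marginsplot, xdimension(baseline)

* Assumed alternative: baseline as a main effect only, with the same
* sample restrictions and random-effects structure as in #1.
mixed d100 i.cod##c.days c.baseline if dcode==1 & baseline>0 & cod!=0 || ID:days, covariance(unstructured) residuals(independent) stddeviations
```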

As for a statistical test, you can test the effect of introducing baseline into the model at all with the following:

Code:

test baseline cod#c.baseline c.days#c.baseline cod#c.days#c.baseline

If you just want to test the effect of interacting baseline with your initial term of interest (cod#c.days), you could do that just by reading the results in the cod#c.days#c.baseline rows of the regression output itself. (Advice offered reluctantly, as I am usually loath to do model selection based on p-values of any kind.)
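
As a sketch of the same two tests written with -testparm-, which accepts factor-variable notation directly: this assumes the three-way model from #1 is the active estimation result. The second command gives a single joint test across the non-base levels of cod, rather than one p-value per row of the output.

Code:

```stata
* Joint Wald test that every term involving baseline is zero.
testparm c.baseline i.cod#c.baseline c.days#c.baseline i.cod#c.days#c.baseline

* Joint Wald test of the three-way interaction only, i.e. whether the
* cod-specific days slopes vary with baseline.
testparm i.cod#c.days#c.baseline
```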

In any case, the original question, as I understood it, was not about model selection but trying to understand what, if anything, the shifts in some of the coefficients meant. The point of the graphs was to focus on the model predictions, which matter directly, instead of the coefficients, whose variations between models are nearly impossible to interpret because the same coefficient means different things in different models.