using xtmixed for cluster RCT.....include cluster variable as both fixed and random?

Dana Rose Garfin

Join Date: Apr 2014

Posts: 21
#1

17 Jul 2018, 12:33

I am analyzing an RCT in which four sites were randomly assigned to four conditions (one site per condition), with four timepoints (baseline, 6 months, 12 months, and 18 months). I am working on specifying my MLM using xtmixed and am a little torn about the best way to handle the treatment condition. It is clustered, but there is only one site per condition.

I am including all covariates that predict baseline differences between the four groups as fixed effects. The question is: do I include treatment condition as a fixed effect only? Do I include it as a random effect as well? Or use the cluster option? See below for the various options I am considering.

Option 1 (group as fixed effect only)
xtmixed dv i.group time covariate1 covariate2 covariate3 covariate4 covariate5 covariate6 || Subject:, var reml

Option 2 (group as level 2 random effect, along with Subject)
xtmixed dv i.group time covariate1 covariate2 covariate3 covariate4 covariate5 covariate6 || Subject: group,

Option 3 (group as a random effect on its own level)
xtmixed dv i.group time covariate1 covariate2 covariate3 covariate4 covariate5 covariate6 || group: || Subject: ,

Option 4 (use cluster command - this option yields estimates most dramatically different from the others)
xtmixed dv i.group time covariate1 covariate2 covariate3 covariate4 covariate5 covariate6 || group: || Subject: , vce(cluster group)

Any advice would be greatly appreciated.
Clyde Schechter

Join Date: Apr 2014

Posts: 29948
#2

17 Jul 2018, 13:20

You want Option 1.

In option 2, you would be specifying a random slope on the group variable (treated as a continuous-valued variable ranging from 1 through 4, or 0 through 3, or however you coded it.) This makes no sense. Apart from the continuous-valued issue, it also makes no sense because you are saying that every subject has his/her own idiosyncratic response to the treatment he/she is assigned to; you are saying there is an interaction between subject and treatment. But you can't estimate that interaction because each subject only gets one treatment, and within the treatment groups, you can't distinguish that interaction effect from the residual. This model would be unidentified and would not converge.

In option 3, you are treating the subjects as nested in groups, which, in fact, they are. But the problem here is that you have also specified i.group at the fixed level, and these two will be competing with each other to capture the impact of group on the outcome. How that competition would shake out is anybody's guess, and the results would be uninterpretable. You could have a model with subject nested in group, but omitting i.group from the fixed level. The problem is that you would be, in effect, stating that your four treatments are not four chosen treatments but rather four treatments randomly selected from a hypothetical population of treatments, and you would get an estimate of the outcome variance attributable to group assignment by selection from that population. But you would not get any estimates of the specific effects of those four specific treatments. So this model, while useful in other ways, would not be useful for your purposes.
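
For illustration only, the nested model just described (subjects within groups, with i.group dropped from the fixed part) might be written roughly as follows. This is a sketch that assumes the variable names from the original post. It reports a variance component for group rather than an effect for each treatment, and with only four groups that variance would be estimated very imprecisely.

[code]
* Sketch: subjects nested in group, no i.group in the fixed part.
* Yields a group-level variance, not the effect of each specific treatment.
mixed dv time covariate1 covariate2 covariate3 covariate4 covariate5 covariate6 || group: || Subject:
[/code]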

In option 4, you are taking invalid model 3 and throwing in cluster-robust variance estimation, which does nothing to improve its serious underlying flaws. By the way, that is the -vce(cluster ...)- option you specified there, not the -cluster- command. Stata actually also has a -cluster- command which does something totally unrelated to this.

Added: Are you sure you want to specify time as a continuous variable in this analysis? You are thereby stipulating the strong claim that the outcome follows a linear trend over time, and the groups just differ in their starting points at baseline, but otherwise follow parallel lines afterward.

Usually we have no basis for assuming a linear trend in outcome over time (though perhaps the science supports it in your context). Even so, we generally would want a model that does not stipulate that the groups just start out at different points and then evolve in parallel. Usually, in fact, we expect that the groups will move in different ways over time following treatment. So the usual model is something like this:

[code]
mixed outcome i.treatment_group##i.time covariate* || Subject:
[/code]
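
As a hedged sketch of how the group-by-time model might be followed up, the commands below assume the variable names from the original post (dv, group, time, covariate1 through covariate6, Subject) and that time takes one distinct value per wave. The post-estimation steps are one reasonable workflow, not something prescribed in this thread.

[code]
* Fit the group-by-time model with time as a categorical factor
mixed dv i.group##i.time covariate1 covariate2 covariate3 covariate4 covariate5 covariate6 || Subject:

* Joint Wald test of the interaction terms:
* do the groups' trajectories differ over time?
testparm i.group#i.time

* Model-predicted mean outcome for each group at each time point
margins group#time

* Plot the predicted trajectories, one line per group
marginsplot, xdimension(time)
[/code]

If the joint test and the plot both suggest near-parallel trajectories, that is evidence about the interaction, not about whether the time trend itself is linear; the latter is a separate question.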

Last edited by Clyde Schechter; 17 Jul 2018, 13:24.
Dana Rose Garfin

Join Date: Apr 2014

Posts: 21
#3

17 Jul 2018, 13:43

Clyde - thank you so much! I had been using Option 1 in all my preliminary analyses, but then started to second guess myself and went down a little rabbit hole. Good idea to check time as a categorical variable....in the syntax I used I actually checked the interaction between group and time but didn't copy it into my post for some reason. Sorry about that. When I graph the lines the trajectories between groups are all very similar. As a follow-up, do you recommend the reml option or ml? Thanks so much for lending your expertise!!!

Last edited by Dana Rose Garfin; 17 Jul 2018, 13:59.