Apologies for the long email. I am trying to replicate some SAS code in Stata (I want to show some students that Stata can do everything SAS can with respect to our desired models).
Our data were generated from a group-randomized trial where students are nested within schools, and schools are nested within treatment condition such that a whole school is randomized to either treatment or control condition. There are two time points, pre and post treatment. The outcome variable is math score.
One may view the data in two ways. The first is what we call a member cross-section. Here we do NOT link a particular student at baseline (time=0) to their assessment at follow-up (time=1). Instead we assume the schools are the same but the students within a school change, either due to sampling or perhaps graduation (e.g., senior class in 2011 and 2012).
Apart from some formatting and other details (e.g., ML vs. REML), the well-understood SAS code is
* member XS analysis;
proc mixed;
class cond school time;
model math = cond time cond*time /s;
random int time/subject=school(cond);
run;
Basically, this simply regresses the outcome variable, math, on the condition and time effects and their interaction. We have random effects for schools nested within condition and for time within school. We treat cond, school, and time as factor variables.
In Stata (v13), the following will yield identical results save for the degrees of freedom for the parameter estimates. For what it’s worth, Stata relies on the Z distribution, but we rarely have that many groups/clusters in a public health intervention. Accordingly, one must use a post-estimation command such as lincom or margins to manually specify the degrees of freedom for effect estimates, for example df(17). Regardless, the parameter estimates are correct if we type
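For what it's worth, one way to write out the model that this SAS code fits is sketched below. The notation is an assumption on my part (member i, time t, school g, condition c; the symbol names are illustrative and do not come from the SAS output), and it assumes the default variance-components structure for the random statement.

```latex
% Member cross-section model (illustrative notation, not from the SAS output)
\begin{align*}
y_{i:tg:c} &= \mu + C_c + T_t + (CT)_{ct} + G_{g:c} + (TG)_{tg:c} + \varepsilon_{i:tg:c}, \\
G_{g:c} &\sim N(0, \sigma^2_{g}), \qquad
(TG)_{tg:c} \sim N(0, \sigma^2_{tg}), \qquad
\varepsilon_{i:tg:c} \sim N(0, \sigma^2_{e}).
\end{align*}
```

Here C, T, and CT are the fixed effects for condition, time, and their interaction; G is the random school effect (the int part of the random statement) and TG is the random time-within-school effect (the time part).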
* member XS analysis
mixed math cond##time || school: || time:, reml
It’s the second approach to the data that is vexing me in Stata. Here we DO want to link a particular person from baseline to follow-up. We call this a member cohort analysis. Again, the relevant SAS code is
* Member cohort approach #1;
proc mixed;
class cond school time;
format time timef. cond condf.;
model math = cond time cond*time /s;
random int time/subject=school(cond);
random int/subject=id(school*cond) ;
run;
We could also run the model this way, which will yield identical results
* Member cohort approach #2;
proc mixed;
class cond school time;
format time timef. cond condf.;
model math = cond time cond*time /s;
random int time/subject=school(cond);
repeated time/subject=id(school*cond) type=cs;
run;
Notice that the only change is the extra line at the bottom, which tells SAS that subjects, denoted by id, have repeated observations and are nested within schools, which are in turn nested within condition.
I’ve searched far and wide and done a bunch of trial and error. I cannot figure out how to get these results from Stata.
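To make the target concrete, here is a sketch of the member cohort model as I understand it. Again, the notation is my own assumption (member i, time t, school g, condition c), with the default variance-components structure assumed for the random statements.

```latex
% Member cohort model, approach #1 (illustrative notation)
\begin{align*}
y_{it:g:c} &= \mu + C_c + T_t + (CT)_{ct} + G_{g:c} + (TG)_{tg:c} + M_{i:g:c} + \varepsilon_{it:g:c}, \\
G_{g:c} &\sim N(0, \sigma^2_{g}), \quad
(TG)_{tg:c} \sim N(0, \sigma^2_{tg}), \quad
M_{i:g:c} \sim N(0, \sigma^2_{m}), \quad
\varepsilon_{it:g:c} \sim N(0, \sigma^2_{e}).
\end{align*}
% Approach #2: the same within-member covariance, expressed as a
% compound-symmetric residual structure over the two time points
\[
\operatorname{Cov}\!\begin{pmatrix} M_{i:g:c} + \varepsilon_{i0:g:c} \\ M_{i:g:c} + \varepsilon_{i1:g:c} \end{pmatrix}
=
\begin{pmatrix}
\sigma^2_{m} + \sigma^2_{e} & \sigma^2_{m} \\
\sigma^2_{m} & \sigma^2_{m} + \sigma^2_{e}
\end{pmatrix}.
\]
```

The only new term relative to the cross-sectional model is the random member effect M. With two time points, a random intercept per member implies the compound-symmetric matrix above, which I take to be why the two SAS approaches agree (at least when the estimated within-member covariance is not negative).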
How can I get mixed (or xtmixed) to recognize repeated observations on particular subjects over time and recognize that such subjects are nested within schools and condition?
Thanks in advance - Michael (UMN Epidemiology)