Hello,
I was hoping someone could help me understand how to assess, both statistically and visually, which statistical model is best for my data. I am currently using mixed effects models.
I am looking at concentrations of a protein, NfL, across age in people with Huntington's disease who have different CAG repeat lengths (40-46). My data are cross-sectional.
The number of data points for each CAG repeat length is listed below. Some repeat lengths have very few data points, and I was wondering whether those groups should be combined because they are underpowered on their own.
I am unsure how to check whether these models are well suited to the data I have, and what to plot: the residuals from the mixed effects model, or the predicted values from margins? Is it correct to fit a new model for each CAG repeat length in order to plot each one?
- 40 = 4 participants
- 41 = 22 participants
- 42 = 15 participants
- 43 = 9 participants
- 44 = 8 participants
- 45 = 1 participant
- 46 = 2 participants
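To make the question about combining groups concrete, here is a minimal sketch of what I think collapsing the sparse repeat lengths would look like. The variable names follow my code below; the new variable `CAG_grp` and the cut-points are only an assumption for illustration, not something I have settled on.

```stata
* Sketch only: CAG_grp and the cut-points below are hypothetical choices
tabulate CAG if disease_grp==1              // confirm the per-CAG counts

* Collapse the sparse tails into neighbouring groups
recode CAG (40/41 = 1 "40-41") (42 = 2 "42") (43 = 3 "43") (44/46 = 4 "44-46") ///
    if disease_grp==1, generate(CAG_grp)

tabulate CAG_grp                            // check the collapsed group sizes
```

With the counts listed above, this grouping would give 26, 15, 9 and 11 participants. The alternative I can see is to keep CAG as a continuous covariate (`c.CAG`), in which case no collapsing is needed but the CAG effect is assumed to be linear.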
Code:
// ---------- controls ----------
// `INF_grey', `INF_Red_Light' and `model' are locals defined earlier in my do-file
summarize Age if disease_grp==0, meanonly
local min = round(r(min))   // lower Age bound for the margins grid
local max = round(r(max))   // upper Age bound for the margins grid
mixed NfL c.Age if disease_grp==0, stddeviations reml
est store mixed_control
local loglike_control : display %5.3f e(ll)   // store log likelihood for control model
margins, at(Age=(`min'(0.15)`max')) post
est store model_control
coefplot (model_control, recast(line) lcolor("`INF_grey'") noci), ///
    at ytitle("NfL pg/mL controls") xtitle("Age, years") plotregion(lstyle(none)) ///
    xlabel(18(10)50) ylabel(0(1)4) name(`model'_control) ///
    addplot(scatter NfL Age if disease_grp==0, mcolor("`INF_grey'"))

// ---------- gene expansion carriers ----------
summarize CAG if disease_grp==1, meanonly
local min_cag = round(r(min))   // first CAG value for the loop below
local max_cag = round(r(max))   // last CAG value for the loop below
forvalues e = `min_cag'/`max_cag' {
    summarize Age if CAG==`e' & disease_grp==1, meanonly
    local min = round(r(min))   // lower Age bound for this CAG
    local max = round(r(max))   // upper Age bound for this CAG
    // same model on every pass; refitted because -margins, post- overwrites e()
    mixed NfL c.Age c.CAG if disease_grp==1, stddeviations reml
    local loglike_`e' : display %5.3f e(ll)   // store log likelihood for carrier model
    // predicted NfL over this CAG's observed Age range, in steps of 0.15 years
    margins, at(Age=(`min'(0.15)`max') CAG=`e') post
    est store model_cag`e'
    coefplot (model_cag`e', recast(line) lcolor("`INF_Red_Light'") noci), ///
        at ytitle("NfL pg/mL, CAG `e'") xtitle("Age, years") plotregion(lstyle(none)) ///
        xlabel(18(10)50) ylabel(0(1)4) ///
        name(`model'_cag`e', replace) ///
        addplot(scatter NfL Age if disease_grp==1 & CAG==`e', mcolor("`INF_Red_Light'"))
}
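On the residuals versus margins question, this is a minimal sketch of the residual checks I think I could run after fitting the carrier model once, followed by one plot of the predicted lines for several repeat lengths from that single fit. The new variable names (`nfl_hat`, `nfl_res`), the graph names, and the Age and CAG grids passed to `margins` are assumptions for illustration.

```stata
* Sketch only: nfl_hat, nfl_res, graph names and the margins grid are hypothetical
mixed NfL c.Age c.CAG if disease_grp==1, stddeviations reml

* Fitted values and residuals for the estimation sample
predict double nfl_hat if e(sample), xb
predict double nfl_res if e(sample), residuals

* Residuals vs fitted: look for curvature or a funnel shape
scatter nfl_res nfl_hat, yline(0) name(res_vs_fit, replace)

* Residuals vs Age, one panel per CAG: look for groups the model fits badly
scatter nfl_res Age, yline(0) by(CAG) name(res_by_cag, replace)

* Normal quantile plot of the residuals
qnorm nfl_res, name(res_qq, replace)

* Predicted NfL across Age for several CAG values from the same fitted model
margins, at(Age=(20(5)50) CAG=(41(1)44))
marginsplot, noci name(pred_by_cag, replace)
```

As far as I can tell, the residual plots address whether the model suits the data, while the margins plot only shows what the model predicts, so the two answer different questions.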
I was also wondering whether an ordinary linear regression would be better suited to this analysis. As I understand it, a linear regression treats each CAG group as a fixed effect and estimates a separate coefficient for each group, whereas a mixed effects model treats the CAG group effects as random, which allows for variability across groups and better generalization. How would I decide which is better to use?
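To show what I mean by the two approaches, here is a minimal sketch of the candidate models as I understand them. The stored estimate names (`m_linear`, `m_factor`, `m_inter`) are hypothetical, and I am not sure this is the right set of models to compare.

```stata
* Sketch only: the stored estimate names are hypothetical
* CAG as a continuous fixed effect (one slope for CAG)
regress NfL c.Age c.CAG if disease_grp==1
estimates store m_linear

* CAG as a categorical fixed effect (a separate coefficient per repeat length)
regress NfL c.Age i.CAG if disease_grp==1
estimates store m_factor

* Age slope allowed to change with CAG
regress NfL c.Age##c.CAG if disease_grp==1
estimates store m_inter

* Compare the fixed-effects models by AIC and BIC
estimates stats m_linear m_factor m_inter

* Likelihood-ratio test: is linear CAG an adequate simplification of categorical CAG?
lrtest m_linear m_factor

* CAG as a random intercept
mixed NfL c.Age if disease_grp==1 || CAG:, reml stddeviations
```

My concern with the last model is that the between-group variance would be estimated from only seven CAG values, several of them with very few participants.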
I look forward to any responses and I hope someone can help with some of my questions at least!
Very best wishes,
Annabelle Coleman