Hi,
I would love some insight on running linear regressions.
I am working with pre-post intervention data and would like to run a regression with change in physical function (SPPB; baseline to 12 months) as the predictor and change in physical health-related quality of life (PCS; baseline to 12 months) as the outcome, adjusting for age, sex, race, weight change (baseline to 12 months), and treatment group (phone or newsletter). Initially, I ran a multiple linear regression using "change" variables, i.e. change in physical function = 12-month SPPB score - baseline SPPB score, and change in physical health-related quality of life = 12-month PCS score - baseline PCS score. However, I have recently learned that I may need a mixed-effects linear model instead, since multiple linear regression may not be suitable for pre-post intervention data.
My first question is: should I be running a mixed-effects linear model, or is multiple linear regression fine for this analysis?
If I should be running a mixed-effects linear model, how do I specify it in Stata with the independent variables, follow-up contact condition (phone, newsletter), visit (baseline, 12 months), and the contact condition by visit interaction as fixed effects and subject as a random effect, while adjusting for age, sex, and race? Is this an appropriate set-up for evaluating the relationship between change in physical function (predictor) and change in physical health-related quality of life (outcome) from baseline to 12 months?
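To make the question concrete, the set-up described above might be sketched in Stata roughly as follows. This is only an assumption about how the pieces fit together, not a confirmed specification: it uses the variable names from the data example in this post, treats PCS_score1 as baseline and PCS_score3 as 12 months, renames them to PCS_score0 and PCS_score12 (illustrative names) so both measures share visit suffixes, and enters SPPB as a time-varying predictor.

```stata
* Sketch only; assumes the dataex example from this post is already in memory.
preserve

* Give PCS the same visit suffixes as SPPB (0 = baseline, 12 = 12 months).
* PCS_score0 and PCS_score12 are illustrative names, not in the original data.
rename (PCS_score1 PCS_score3) (PCS_score0 PCS_score12)

* Reshape from one row per subject to one row per subject per visit.
reshape long PCS_score sppbscore, i(pid) j(visit)

* Fixed effects: SPPB, contact condition, visit, condition-by-visit
* interaction, and the covariates; random intercept for subject (pid).
mixed PCS_score c.sppbscore i.assignment_r##i.visit ///
    c.age_screen i.sex_r i.race_wnw || pid:

* Return to the original wide data.
restore
```

Weight is left out here because the adjustment list for the mixed model above names only age, sex, and race.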
Here is the data I am working with:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long pid byte(sppbscore0 sppbscore12) float(sppb_ch_12mo PCS_score1 PCS_score3 PCS_ch_12mo wt1_mean wt12mo_mean wt_ch_12mo sex_r age_screen race_wnw) long assignment_r
10002 5 6 1 50.7 57.58 6.880001 93.45 79 -14.449997 0 65.41 0 2
10005 4 7 3 39.31 47.52 8.209999 67.6 63.45 -4.1499977 0 72.72 0 1
10008 11 10 -1 57.4 57.4 0 68.05 66.350006 -1.699997 0 74.42 0 2
10011 9 7 -2 50.87 45.19 -5.68 86.5 82.9 -3.5999985 1 78.44 0 2
10012 11 11 0 47.36 41.03 -6.330002 92 79.6 -12.400002 0 67.15 0 2
10013 7 5 -2 41.81 49.8 7.989998 111 89.3 -21.699997 0 71.13 0 1
10015 10 8 -2 55.69 52.68 -3.009998 100.5 94.3 -6.199997 0 65.43 0 2
10017 4 6 2 42.87 53.94 11.07 95.05 91.5 -3.550003 1 80.66 0 2
10024 10 9 -1 50.92 59.59 8.670002 106.1 107.25 1.1500015 0 65.35 1 2
10025 7 8 1 47.1 51.93 4.830002 109.05 96.3 -12.75 1 73.17 0 1
10026 10 11 1 46.81 61.08 14.27 84.7 76.6 -8.099998 0 69.09 0 2
10031 9 7 -2 38.36 39.03 .6699982 94.6 87.4 -7.199997 0 70.95 0 1
10033 9 9 0 57.4 53.92 -3.480003 93.64999 90.75 -2.899994 1 70.81 0 2
10044 6 6 0 53.86 52.38 -1.4799995 82.6 72.3 -10.299995 1 78.11 0 2
10047 11 8 -3 59.6 58.38 -1.2199974 78.75 72.9 -5.849998 0 66.85 1 2
10048 8 8 0 54.64 54.64 0 84.3 84.1 -.20000458 0 70.77 1 1
10054 9 9 0 57.12 41.75 -15.37 72.8 67.1 -5.700005 0 65.81 0 2
10055 11 10 -1 58.38 47.94 -10.440002 111.85 108.3 -3.550003 1 65.28 0 1
10058 8 6 -2 43.45 46.68 3.2299995 93.8 89.8 -4 0 69.4 1 2
10060 10 9 -1 51.28 51 -.27999878 70 64.95 -5.050003 0 66.33 0 2
10063 10 6 -4 49.81 21.66 -28.15 102.15 98.8 -3.349991 1 68.62 0 1
10066 11 11 0 53.5 53.5 0 89.8 90.1 .2999954 0 69.22 0 1
10068 12 11 -1 53.53 53.7 .170002 101.2 94.5 -6.699997 1 66.28 0 1
10069 10 9 -1 43.74 56.13 12.39 71.9 66.6 -5.300003 0 71.99 0 1
10071 10 11 1 42.75 53.33 10.580002 131 106.8 -24.199997 1 66.11 0 2
10077 7 6 -1 48.52 52.3 3.779999 94.7 86.2 -8.5 0 68.39 0 2
10079 10 10 0 52.51 58.89 6.380001 73.649994 66.1 -7.549995 0 72.74 0 2
10081 7 8 1 46.95 49.04 2.0900002 90.7 82.8 -7.899994 0 71.04 1 1
10085 10 8 -2 54.91 58.62 3.709999 85.2 79.85001 -5.349991 0 68.05 1 1
10087 6 7 1 26.1 22.69 -3.41 89.95 75.8 -14.149994 0 73.76 0 1
end
label values sex_r Sex
label def Sex 0 "Female", modify
label def Sex 1 "Male", modify
label values race_wnw race
label def race 0 "White", modify
label def race 1 "Non-white", modify
label values assignment_r assignment_r
label def assignment_r 1 "Newsletter", modify
label def assignment_r 2 "Phone", modify
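For reference, the change variables in this listing follow the definitions given earlier (12-month value minus baseline value), and that can be checked with a short sketch like the one here. It assumes PCS_score1 is the baseline PCS and PCS_score3 is the 12-month PCS, which is what the listed values imply. The `*_check` variable names are made up for illustration.

```stata
* Sketch only; assumes the dataex example above is already in memory.
* The *_check variables are illustrative names, not part of the original data.
generate sppb_ch_check = sppbscore12 - sppbscore0
generate pcs_ch_check  = PCS_score3 - PCS_score1
generate wt_ch_check   = wt12mo_mean - wt1_mean

* Each recomputed change should match the stored one (up to float rounding).
assert sppb_ch_check == sppb_ch_12mo
assert abs(pcs_ch_check - PCS_ch_12mo) < 0.001
assert abs(wt_ch_check - wt_ch_12mo) < 0.001
```

A negative value therefore means a decline from baseline to 12 months, e.g. weight loss for wt_ch_12mo.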
My original multiple linear regression code was this:
Code:
regress PCS_ch_12mo sppb_ch_12mo age_screen sex_r race_wnw assignment_r wt_ch_12mo
Another question: if the multiple linear regression is fine, should I also include the baseline PCS score as a covariate?
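To be clear about what I mean, the baseline-adjusted version would look something like this sketch. It is the same change-score model with PCS_score1 added, assuming PCS_score1 is the baseline PCS score as the data example suggests.

```stata
* Sketch only; assumes the dataex example from this post is already in memory.
* Original change-score model plus baseline PCS (PCS_score1) as a covariate.
regress PCS_ch_12mo sppb_ch_12mo PCS_score1 age_screen sex_r race_wnw assignment_r wt_ch_12mo
```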
Thank you for any help.