I could really use some help checking whether I'm on the right track with 1) the model I've selected, and 2) my interpretation of the model output.
I want to analyse the associations between respiratory function at follow-up and the variables age, gender, smoking, and baseline respiratory function, specifically for a group of patients with a specific lung disease.
I have a theory that the decline in lung function over the years for this patient group can be explained by time (the older you get, the worse the lung function).
Each patient has a baseline lung function measurement, followed by 1-3 follow-up visits at which lung function is measured again.
Example of my data:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str4 recordid float(baseline_date date days_since_baseline) byte(sex age smoking)
"1"  21728 21728    0 1 62 0
"1"  21728 22719  991 1 62 0
"10" 20841 20841    0 1 34 0
"10" 20841 21577  736 1 34 0
"10" 20841 22305 1464 1 34 0
end
format %td baseline_date
format %td date
I used a linear mixed effect model:
Code:
mixed lung_function smoking sex age days_since_baseline || recordid: days_since_baseline
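As far as I understand it, this command corresponds to the model below. This is my own sketch of it, not something Stata printed. The symbols are my notation: $y_{ij}$ is lung function for patient $i$ at visit $j$, and $t_{ij}$ is days_since_baseline. I am assuming smoking, sex, and age are constant within a patient, as they are in the data example, and that the random intercept and random slope are uncorrelated, since I did not specify a covariance structure.

$$
y_{ij} = \beta_0 + \beta_1\,\text{smoking}_i + \beta_2\,\text{sex}_i + \beta_3\,\text{age}_i + \beta_4\,t_{ij} + u_{0i} + u_{1i}\,t_{ij} + \varepsilon_{ij}
$$

$$
u_{0i} \sim N(0, \sigma^2_{u0}), \qquad u_{1i} \sim N(0, \sigma^2_{u1}), \qquad \operatorname{Cov}(u_{0i}, u_{1i}) = 0, \qquad \varepsilon_{ij} \sim N(0, \sigma^2_{e})
$$

Here $\sigma^2_{u0}$ would be var(_cons), $\sigma^2_{u1}$ would be var(days_since_baseline), and $\sigma^2_{e}$ would be var(Residual).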
And got the following output:
Code:
Performing EM optimization ...

Performing gradient-based optimization:
Iteration 0:  Log likelihood = -434.99454
Iteration 1:  Log likelihood = -434.38122
Iteration 2:  Log likelihood = -434.36409
Iteration 3:  Log likelihood = -434.36264
Iteration 4:  Log likelihood = -434.36251
Iteration 5:  Log likelihood = -434.36251

Computing standard errors ...

Mixed-effects ML regression                          Number of obs    =    118
Group variable: recordid                             Number of groups =     49
                                                     Obs per group:
                                                                  min =      1
                                                                  avg =    2.4
                                                                  max =      4
                                                     Wald chi2(4)     =  20.68
Log likelihood = -434.36251                          Prob > chi2      = 0.0004

------------------------------------------------------------------------------------------
           lung_function | Coefficient   Std. err.      z    P>|z|    [95% conf. interval]
-------------------------+----------------------------------------------------------------
                 smoking |    -2.62267     2.20514   -1.19   0.234   -6.944665    1.699325
                     sex |   -.3625873    3.859876   -0.09   0.925   -7.927806    7.202632
                     age |   -.2484607    .1275995   -1.95   0.052   -.4985513    .0016298
     days_since_baseline |   -.0040639    .0010194   -3.99   0.000    -.006062   -.0020658
                   _cons |    113.3245    7.138893   15.87   0.000    99.33253    127.3165
------------------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
recordid: Independent        |
    var(days_since_baseline) |   2.69e-13   3.76e-10             0           .
                  var(_cons) |   158.1431   34.86357      102.6592    243.6142
-----------------------------+------------------------------------------------
               var(Residual) |   32.42914   5.543997      23.19617    45.33718
------------------------------------------------------------------------------
LR test vs. linear model: chi2(2) = 89.32                 Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.
I interpret this as the following:
Fixed effects (estimated effects across all patients):
- Smoking, sex, and age do not show a significant effect on lung function (however, age is borderline significant).
- The time variable (days since baseline) is significant: For each day since baseline, lung function decreases by 0.004 units. I.e. lung function declines over time.
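To put the per-day coefficient on a more readable scale, I converted it to a per-year change. This is a rough conversion only: it assumes 365.25 days per year and that the decline is linear over the follow-up period.

$$
-0.0040639 \times 365.25 \approx -1.48 \ \text{units per year}
$$

$$
95\%\ \text{CI:}\quad (-0.006062 \times 365.25,\ -0.0020658 \times 365.25) \approx (-2.21,\ -0.75) \ \text{units per year}
$$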
Random effects (the variability within and between patients (record id)):
- The random slope (days since baseline) coefficient is basically 0, meaning there is no variability in how lung function changes over time across patients. I.e. the rate of lung function decline is similar across the patients.
- The random intercept (recordid): there is substantial variation in the baseline lung function level between patients. I.e. the patients' baseline lung function measures were not similar.
- Residual variance = 32.4. This means that approx 30% of the variation in lung function is not explained by the fixed and random effects (i.e. not explained by the variables included in the model).
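I also tried to check the residual figure myself, and I am not sure my 30% reading holds. The arithmetic below assumes that var(Residual) is a variance in squared lung-function units rather than a percentage, and that the random-slope variance is small enough to ignore. It compares the residual variance with the total variance left over after the fixed effects:

$$
\frac{\sigma^2_{e}}{\sigma^2_{u0} + \sigma^2_{e}} = \frac{32.42914}{158.1431 + 32.42914} = \frac{32.42914}{190.57224} \approx 0.17
$$

$$
\text{ICC} = \frac{\sigma^2_{u0}}{\sigma^2_{u0} + \sigma^2_{e}} = \frac{158.1431}{190.57224} \approx 0.83
$$

If this is the right way to read it, about 17% of the variance not accounted for by the fixed effects is within-patient (residual), and about 83% is between-patient differences in level.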