Hello all,
I have the following data for which I am trying to impute the missing values for two outcome variables y and z. The design is a three-level hierarchical design that involves repeated measures (level 1) of individuals (level 2) in clusters (level 3) at discrete time intervals during the study.
To impute hierarchical data like this, the Stata FAQ (https://www.stata.com/support/faqs/s...and-mi-impute/) recommends, under strategy 3 (use a multivariate normal model to impute all clusters simultaneously), first reshaping the data from long to wide format. I have been able to do that successfully with the reshape command shown in the code further down.
My questions are as follows:
1. The link shows an example in strategy 3 of a two-level model, but I have a three-level model. Would it be correct in principle to reshape the data twice and then impute? If so, how would I reshape a second time?
2. Once I reshaped the data, each variable in the dataset (except for id and cluster) was split into three variables, one per time point (e.g. y1, y2, y3). I believe the code for my imputation should look like the mi impute code further down.
In theory, the imputation model and the analytic model should contain the same variables (including the dependent variable). However, I need to include time as a fixed effect in my analytic model (the mixed model run with mi estimate), and time has evidently been 'removed' during the reshaping process. How do I reconcile this?
Observe as well that in the analytic model I entered a single variable each for y, intervention, and time, but in the imputation model (mi impute) I have each of these variables repeated 3 times (e.g. y1, y2, y3). Is this permissible?
Is it okay to run the mi estimate command twice since I have two outcome variables?
Code:
id  time  y     z     cluster  intervention
1   1     0.5   0.23  1        0
1   2     .     0.11  1        1
1   3     .     .     1        1
2   1     0.15  .     2        0
2   2     0.05  0.05  2        0
2   3     .     .     2        1
3   1     0.90  0.90  1        0
3   2     0.23  0.81  1        1
3   3     0.22  0.22  1        1
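In case it helps anyone reproduce this, here is a minimal sketch that enters the example data above in long format. It assumes all six variables are stored as numeric, which may not match my real dataset. The list and misstable summarize lines are only there to check what was read in and how much is missing.

```stata
* Sketch: recreate the example data in long format
* (assumption: all six variables are numeric)
clear
input id time y z cluster intervention
1 1 0.5  0.23 1 0
1 2 .    0.11 1 1
1 3 .    .    1 1
2 1 0.15 .    2 0
2 2 0.05 0.05 2 0
2 3 .    .    2 1
3 1 0.90 0.90 1 0
3 2 0.23 0.81 1 1
3 3 0.22 0.22 1 1
end

* Show the rows grouped by individual
list, sepby(id) noobs

* Count the missing values in the two outcome variables
misstable summarize y z
```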
Code:
reshape wide y z intervention, i(id) j(time) string
Code:
id  y3   z3  intervention3  y2   z2   intervention2  y1   z1   intervention1  cluster
1   .    .   1              .    .11  1              .5   .23  0              1
2   .    .   1              .05  .05  0              .15  .    0              2
3   .45  .   1              .23  .81  1              .9   .9   0              1
Code:
mi set wide
mi register imputed y1 y2 y3 z1 z2 z3
* Fixed effects for cluster and id are added here because they enter the analytic model as random-effects parameters
mi impute mvn y1 y2 y3 z1 z2 z3 = intervention1 intervention2 intervention3 cluster id, add(100) noisily
mi reshape long y z intervention, i(id) j(time) string
mi estimate: mixed y intervention time || cluster: || id:
mi estimate: mixed z intervention time || cluster: || id:
Thanks!