Multivariate multiple imputation - Imputation versus analytic model

CEdward

Join Date: Nov 2014

Posts: 131
#1

30 May 2020, 08:03

Hello all,

I have the following data for which I am trying to impute the missing values for two outcome variables y and z. The design is a three-level hierarchical design that involves repeated measures (level 1) of individuals (level 2) in clusters (level 3) at discrete time intervals during the study.

Code:

id  time  y     z     cluster  intervention
1   1     0.5   0.23  1        0
1   2     .     0.11  1        1
1   3     .     .     1        1
2   1     0.15  .     2        0
2   2     0.05  0.05  2        0
2   3     .     .     2        1
3   1     0.90  0.90  1        0
3   2     0.23  0.81  1        1
3   3     0.22  0.22  1        1

To impute this hierarchical data, it is recommended that the data first be transformed from long to wide format (https://www.stata.com/support/faqs/s...and-mi-impute/) under strategy 3: use a multivariate normal model to impute all clusters simultaneously. I have been able to do that successfully like so

Code:

reshape wide y z intervention, i(id) j(time) string

Code:

id  y3   z3  intervention3  y2   z2   intervention2  y1   z1   intervention1  cluster
1   .    .   1              .    .11  1              .5   .23  0              1
2   .    .   1              .05  .05  0              .15  .    0              2
3   .45  .   1              .23  .81  1              .9   .9   0              1

My questions are as follows:

1. The link shows an example in strategy 3 of a two-level model, but I have a three-level model. Would it be correct in principle to reshape the data twice and then impute? If so, how would I reshape a second time?

2. Once I reshaped the data, each variable in the dataset (except for id and cluster) was split into three variables, one per time point (e.g. y1, y2, y3). I believe the code for my imputation should look like so:

Code:

mi set wide
mi register imputed y1 y2 y3 z1 z2 z3

// Fixed effects for cluster and id added here because they are added as
// random effects parameters in the analytic model
mi impute mvn y1 y2 y3 z1 z2 z3 = intervention1 intervention2 intervention3 cluster id, add(100) noisily

mi reshape long y z intervention, i(id) j(time) string

mi estimate: mixed y intervention time || cluster: || id:
mi estimate: mixed z intervention time || cluster: || id:

In theory the imputation model and the analytic model should contain the same variables (including the dependent variable). However, I need to include time as a fixed effect variable in my analytic model with mi estimate above which has evidently been 'removed' during the reshaping process. How do I reconcile this?
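To make the question about time concrete, here is a minimal sketch of the wide-to-long round trip on the toy data above. It assumes time is stored as a numeric variable (so the string option is dropped) and uses plain reshape rather than mi reshape; that mi reshape long rebuilds time in the same way is my assumption, not something I have verified on imputed data.

```stata
clear
input id time y z cluster intervention
1 1 0.5  0.23 1 0
1 2 .    0.11 1 1
1 3 .    .    1 1
2 1 0.15 .    2 0
2 2 0.05 0.05 2 0
2 3 .    .    2 1
3 1 0.90 0.90 1 0
3 2 0.23 0.81 1 1
3 3 0.22 0.22 1 1
end

* Long -> wide: time disappears as a variable and becomes the suffix 1, 2, 3
reshape wide y z intervention, i(id) j(time)
describe

* Wide -> long: j(time) recreates time from the variable-name suffixes
reshape long y z intervention, i(id) j(time)

* Check that time is back, with one row per id-time combination
assert _N == 9
assert inlist(time, 1, 2, 3)
list, sepby(id)
```

If this carries over to mi reshape long, then time would be available again as a covariate by the time mi estimate is run, even though it never appears as a variable in the wide imputation step.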

Observe as well that in the analytic model I entered a single variable for y, intervention, and time, but in the imputation model (mi impute) I have each of these variables repeated 3 times (e.g. y1, y2, y3). Is this permissible?

Is it okay to run the mi estimate command twice since I have two outcome variables?

Thanks!

Last edited by CEdward; 30 May 2020, 08:06.