Greetings,
INTRO: I am working with a longitudinal dataset derived from an RCT evaluating a program. There were 6 separate data collection periods over 18 months. Attrition was substantial, so I multiply imputed the data using Stata. The study uses a social science framework, so the dataset contains both observed variables (imputed) and latent scale variables (passive). I imputed the observed variables and then calculated the passive variables from the imputed values.
QUESTION 1: I am struggling with how to set up the format of the imputed dataset. The original dataset is in wide format (i.e. each row contains all the information from all six longitudinal surveys for one individual). Right now, however, the imputed dataset is in long format: each imputed copy of the longitudinal dataset is appended below the previous one, and a variable identifies which cases belong to which imputation. Is this the best format for a multiply imputed longitudinal dataset?
QUESTION 2: I need to run some path models on the imputed data (e.g. latent growth curve models and latent class analyses). As far as I know, the mi estimate command is not going to work with the SEM builder. So how do I let Stata know that I am working with an imputed dataset? Do I just run the SEM models with different groups based on the imputation variable? Or do I run the SEM models on each imputed dataset individually and then manually calculate the pooled estimates? Any thoughts are appreciated.
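For reference, by "manually calculate the pooled estimates" I mean something like Rubin's rules: average the point estimates across imputations, then combine the within-imputation and between-imputation variance. Below is a minimal sketch in Python rather than Stata, for a single parameter. The function name and the numbers are hypothetical, and it leaves out the degrees-of-freedom adjustment needed for tests and confidence intervals.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates  -- point estimate of the parameter from each imputed dataset
    std_errors -- matching standard error from each imputed dataset
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)                      # average point estimate
    within = mean([se ** 2 for se in std_errors])  # mean within-imputation variance
    between = variance(estimates)                  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between         # total variance

    return pooled, sqrt(total)


# Hypothetical slope estimates and standard errors from five imputed datasets.
slopes = [0.42, 0.39, 0.45, 0.41, 0.44]
ses = [0.10, 0.11, 0.10, 0.12, 0.10]
estimate, se = pool_rubin(slopes, ses)
print(f"pooled estimate = {estimate:.3f}, pooled SE = {se:.3f}")
```

This would have to be repeated for every parameter in the model, which is why I am hoping there is a built-in way to do it.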
Thanks in advance,
Sam