I am working with data from 2 arms (intervention/control) and 3 study waves (v1, v2, v3). I am trying to impute the 23 items of a scale rather than the summed scale itself. As far as I can tell, the missingness is random across the items. Each item is a Likert response from 0 to 4, and the 23 imputed items will be summed to give a score ranging from 0 to 92. I was unable to get mi impute chained with ologit to converge, but found a workaround in truncreg, which does not give me ordinal values per se, but the imputed values are within range and usable for the sum. My current code looks like this (I include the overall scale, with mean-imputed values for those with missingness so that they are not dropped from the model, because my review of the literature suggests that including the overall scale is recommended):
mi impute chained (truncreg, ll(0) ul(4)) item1 item2 item3....item23 = overall_score educ marstat age food_insecure_i sex, by(arm visit) add(20) replace force rseed (12345)
Here is the issue: in an ideal world, I would use items 2-23 in the prediction model to impute item 1, and so forth. Is there a computationally efficient way to do this? I have two thoughts:
1: If I run mi impute 23 times, would I need to mi export each data set then merge 23 datasets in order to subsequently run mi estimate commands properly? This seems computationally intensive and unnecessary.
and/or
2: The other scale items (the potential predictor variables) are not all complete, which is the very problem I am dealing with; however, their observed values are informative for predicting the missing ones. But I cannot impute across all the different possible combinations of missingness. For example, if someone is missing Q1 and Q3, and I am imputing Q1 with Q3 included as a predictor for everyone, their Q1 value will not get imputed, because they are missing Q3 and will be dropped from the prediction model. Is there some specification to avoid this?
I need to keep the by(arm visit) specification, as the intervention will affect these scores and their changes over time, so I don't necessarily want to use P1V3 to predict P1V1, etc.
I hope this is all clear, and I appreciate any advice and guidance anyone can give.