Multiple imputation takes forever

Sandy Tak

Join Date: Jan 2019

Posts: 6
#1

20 Mar 2019, 12:33

Hello everyone,

I'd like to get some help with multiple imputation. I searched everywhere but couldn't find an answer to the problem I ran into.

I am trying to impute 3 variables using mi impute chained.

The initial dataset had 22,000 cases, with V1 (11% missing), V2 (11% missing), and V3 (10% missing). It took 5 minutes to impute these 3 variables.
Later I added more cases, bringing the total to 28,000, with V1 (19.5% missing), V2 (11.7% missing), and V3 (12.4% missing). Now it is taking forever.

Is this just because of the substantially increased share of missing values for V1, or because of the larger number of cases overall?
Does anyone have a solution to this problem or a suggestion for imputing these variables?

Thank you always!
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

20 Mar 2019, 15:20

You may try the following options: -augment- and -force-. You may also order the variables to be imputed in the command line, from those with the least missing data to those with the highest proportion of missing data.
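
For what it's worth, here is a minimal sketch of how those options might look on the command line. The imputation method (mlogit for all three variables), the predictors x1 and x2, the number of imputations, and the seed are assumptions for illustration only, since the original post does not show the actual command. The variables are listed as V2, V3, V1, that is, from least to most missing in the 28,000-case data. As far as I know, -mi impute chained- already orders the variables from most to least observed by default, and -orderasis- is only needed to force the order as typed.

```stata
* Sketch only: methods, predictors (x1 x2), add() and rseed() are assumptions.
* Skip the next two lines if the data are already mi set and registered.
mi set wide
mi register imputed V1 V2 V3

* augment: augmented regression when perfect prediction is detected
* force:   proceed even if some imputed values remain missing
mi impute chained (mlogit) V2 V3 V1 = x1 x2, add(5) rseed(12345) augment force
```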

Hopefully that helps.

Best regards,

Marcos
Sandy Tak

Join Date: Jan 2019

Posts: 6
#3

21 Mar 2019, 11:49

Hi Marcos,

Thanks for your suggestion.

I have read a few postings about MI in this community, many of which asked, as I did, why it takes so long.
After a few hours of running the syntax, the message I got was "convergence not achieved. mlogit failed to converge on observed data."
I have now been trying the method you suggested for more than an hour and am still waiting, but I suspect it will not work either.
I feel if it works it should work within like ten minutes. If it runs more than ten minutes, I guess it means there is something wrong.
So my question is why: why does syntax that worked before stop working once I added more cases?
Rich Goldstein

Join Date: Mar 2014

Posts: 4462
#4

21 Mar 2019, 12:33

Sometimes the user-written -ice- command will work when the official -mi- won't (especially if you use the "persist" option); use -search ice- to find and download it. If you prefer to analyze using the official commands, there is an -mi import ice- command just for this purpose.
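
A rough sketch of that route, in case it helps. The predictors x1 and x2, the number of imputations, the seed, and the file name ice_imputed are placeholders rather than anything from the original post, and the per-variable model choice (which -ice- lets you set with cmd()) is left at its defaults here.

```stata
* Sketch only: x1 x2, m(5), seed() and the file name are assumptions
* Find and install the user-written package first
search ice

* persist: carry on when a univariate imputation model raises an error
ice V1 V2 V3 x1 x2, m(5) seed(12345) persist saving(ice_imputed, replace)

* Convert the ice output to official mi format for analysis with mi estimate
use ice_imputed, clear
mi import ice, automatic
```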
daniel klein

Join Date: Mar 2014

Posts: 3850
#5

21 Mar 2019, 15:13

Originally posted by Sandy Tak

I feel if it works it should work within like ten minutes. If it runs more than ten minutes, I guess it means there is something wrong.

No need for guessing; specify the -noisily- option with -mi impute-.
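
Something along these lines, as a sketch. The methods and the predictors x1 and x2 are assumptions, since the actual command was not posted. The first call only lists the conditional models without imputing anything; the second requests a single imputation with the output of each conditional model displayed, so you can see which model struggles and where.

```stata
* Sketch only: methods and predictors (x1 x2) are assumptions
* Show the conditional specifications without imputing any data
mi impute chained (mlogit) V1 V2 V3 = x1 x2, dryrun

* One imputation, displaying the output of every conditional model
mi impute chained (mlogit) V1 V2 V3 = x1 x2, add(1) rseed(12345) noisily
```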

Originally posted by Rich Goldstein

sometimes the user-written -ice- command will work when the official -mi- won't (especially if you use the "persist" option);

I am still wondering whether this is actually a good thing. The following is pure speculation, but I could imagine that Stata's mi routine stops at some point, e.g., when the amount of augmented ("made up") data is getting too large; imagine adding more pseudo-observations than you have actually observed to get the model to converge. Perhaps I just find it hard to imagine that programmers at StataCorp cannot solve problems that researchers, who do not write programs for a living, are able to solve (I do not mean to question Patrick Royston's programming skills in any way here). On the other hand, perhaps Stata's mi routine just does not go beyond certain limits that have been explored in the literature where it would be fine to do so.

Best
Daniel

Last edited by daniel klein; 21 Mar 2019, 15:15.