I will start this somewhat lengthy post by repeating my earlier requests on the Wishlist for Stata 17 and the Wishlist for Stata 16.
My problem is with the behavior of mi impute chained when one of the models, usually mlogit, fails to converge. If that happens, Stata exits with an error message and, more relevant to this post, discards all imputed values that have been added so far. This behavior is consistent with Stata's philosophy of "doing it all or doing nothing at all". It is also useful if there is something wrong with the imputation model that we should fix. The behavior is, however, frustrating if the model in question fails to converge in, say, iteration 7 on m=42. By then, the respective model has successfully converged 416 times (assuming the default burn-in) before failing once. Chances are, there is no systematic problem with that model; chances are, the model will converge again in iteration 8 on m=42.
I argue that stopping the imputation process altogether because one of the models fails to converge once is not only frustrating but also leads to worse imputation models in practice. The reason is that, confronted with the described problem, we, as users, are left with one choice: modify the respective model. There are different ways of doing so, such as omitting predictors, collapsing some categories of the outcome, or using a different method, e.g., pmm. None of these modifications is desirable, and all of them necessarily affect all iterations in all imputations, thus making the imputation model worse. Instead of affecting all iterations in all imputations, I would rather be able to skip the one iteration in which the model happens not to converge.
The community-contributed ice command (Royston, SSC, SJ) offers a persist option that ignores errors, such as non-convergence. It would be even better if we could specify which errors we are willing to ignore and how often we are willing to ignore them. Still, this option is something that StataCorp should seriously consider borrowing. Personally, I trust ice, but I am just a little bit more comfortable with Stata's mi suite. Therefore, I have written a crude workaround wrapper for mi impute that persists in case of non-convergence. Here is an example, using a modified version of auto.dta:
Code:
version 12.1 // needed for the seed
set seed 42
set maxiter 20 // don't want to wait for 16,000 iterations

// example data
sysuse auto , clear
replace mpg = . if runiform()>.6
replace price = . if runiform()>.4

// mi setting
mi set mlong
mi register imp rep78 mpg price

The modifications above lead to convergence issues when we impute missing values for rep78 with mlogit:
Code:
. mi impute chained ///
>         (mlogit , augment) rep78 ///
>         (pmm , knn(3)) mpg price ///
>         , add(10) noisily

Conditional models:
           rep78: mlogit rep78 mpg price , augment noisily
             mpg: pmm mpg i.rep78 price , knn(3) noisily
           price: pmm price i.rep78 mpg , knn(3) noisily

Performing monotone imputation, m=1:

Running mlogit on observed data, m=1:

Iteration 0:   log likelihood = -93.692061
Iteration 1:   log likelihood = -93.692061

Multinomial logistic regression                 Number of obs   =         69
                                                LR chi2(0)      =       0.00
                                                Prob > chi2     =          .
Log likelihood = -93.692061                     Pseudo R2       =     0.0000

------------------------------------------------------------------------------
       rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |
       _cons |   -2.70805   .7302967    -3.71   0.000    -4.139406   -1.276695
-------------+----------------------------------------------------------------
2            |
       _cons |  -1.321756   .3979112    -3.32   0.001    -2.101647   -.5418642
-------------+----------------------------------------------------------------
3            |  (base outcome)
-------------+----------------------------------------------------------------
4            |
       _cons |  -.5108256   .2981424    -1.71   0.087    -1.095174    .0735227
-------------+----------------------------------------------------------------
5            |
       _cons |  -1.003302   .3524804    -2.85   0.004    -1.694151   -.3124532
------------------------------------------------------------------------------

[...]

Running mlogit on data from iteration 8, m=1:

Iteration 0:   log likelihood = -93.692061
Iteration 1:   log likelihood = -84.819893
Iteration 2:   log likelihood = -81.752821
Iteration 3:   log likelihood = -79.824403
Iteration 4:   log likelihood =  -79.07954
Iteration 5:   log likelihood = -78.816167
Iteration 6:   log likelihood = -78.665878
Iteration 7:   log likelihood = -78.582992
Iteration 8:   log likelihood = -78.566297
Iteration 9:   log likelihood = -78.562641
Iteration 10:  log likelihood = -78.561756
Iteration 11:  log likelihood = -78.561571
Iteration 12:  log likelihood = -78.561532  (not concave)
Iteration 13:  log likelihood = -78.561531  (not concave)
Iteration 14:  log likelihood =  -78.56153  (not concave)
Iteration 15:  log likelihood =  -78.56153  (not concave)
Iteration 16:  log likelihood =  -78.56153  (not concave)
Iteration 17:  log likelihood =  -78.56153  (not concave)
Iteration 18:  log likelihood =  -78.56153  (not concave)
Iteration 19:  log likelihood =  -78.56153  (not concave)
Iteration 20:  log likelihood =  -78.56153  (not concave)
convergence not achieved

Multinomial logistic regression                 Number of obs   =         69
                                                LR chi2(7)      =      30.26
                                                Prob > chi2     =     0.0001
Log likelihood =  -78.56153                     Pseudo R2       =     0.1615

------------------------------------------------------------------------------
       rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |
         mpg |  -19.66078   405.0998    -0.05   0.961    -813.6419    774.3203
       price |  -.0710612   1.818402    -0.04   0.969    -3.635063    3.492941
       _cons |   639.3719          .        .       .            .           .
-------------+----------------------------------------------------------------
2            |
         mpg |   -.089131   .1029034    -0.87   0.386    -.2908178    .1125559
       price |   .0001489   .0001166     1.28   0.202    -.0000797    .0003774
       _cons |  -.8377781   2.345939    -0.36   0.721    -5.435735    3.760178
-------------+----------------------------------------------------------------
3            |  (base outcome)
-------------+----------------------------------------------------------------
4            |
         mpg |    .075735   .0570139     1.33   0.184    -.0360101    .1874802
       price |   .0000827   .0000967     0.85   0.393    -.0001068    .0002722
       _cons |  -2.645726   1.600045    -1.65   0.098    -5.781756    .4903036
-------------+----------------------------------------------------------------
5            |
         mpg |   .1692951   .0652532     2.59   0.009     .0414013     .297189
       price |   .0000554   .0001334     0.42   0.678     -.000206    .0003169
       _cons |  -5.221147   2.054025    -2.54   0.011    -9.246962   -1.195332
------------------------------------------------------------------------------
Note: 1 observation completely determined.  Standard errors questionable.
convergence not achieved
mlogit failed to converge on observed data
error occurred during imputation of rep78 mpg price on m = 1
r(430);
Note that the model had converged 7 times before failing once. Here is how the wrapper, mimpt, works:
Code:
. mimpt chained ///
>         (mlogit , augment) rep78 ///
>         (pmm , knn(3)) mpg price ///
>         , add(10) skipnonconvergence(5)

Conditional models:
           rep78: mlogit rep78 mpg price , augment
             mpg: pmm mpg i.rep78 price , knn(3)
           price: pmm price i.rep78 mpg , knn(3)

Performing chained iterations ...
convergence not achieved
convergence not achieved
mlogit failed to converge on observed data
error occurred during imputation of rep78 mpg price on m = 1

[...]

Conditional models:
           rep78: mlogit rep78 mpg price , augment
             mpg: pmm mpg i.rep78 price , knn(3)
           price: pmm price i.rep78 mpg , knn(3)

Performing chained iterations ...

Multivariate imputation                     Imputations =       10
Chained equations                                 added =        1
Imputed: m=10                                   updated =        0

Initialization: monotone                     Iterations =       10
                                                burn-in =       10

           rep78: augmented multinomial logistic regression
             mpg: predictive mean matching
           price: predictive mean matching

------------------------------------------------------------------
                   |               Observations per m
                   |----------------------------------------------
          Variable |   Complete   Incomplete   Imputed |     Total
-------------------+-----------------------------------+----------
             rep78 |         69            5         5 |        74
               mpg |         44           30        30 |        74
             price |         29           45        45 |        74
------------------------------------------------------------------
(complete + incomplete = total; imputed is the minimum across m
 of the number of filled-in observations.)

Warning: the sets of predictors of the imputation model vary across
         imputations or iterations
Warning: the imputation model failed to converge 2 times
I have typed

Code:
mimpt ... , skipnonconvergence(5)

where the required option skipnonconvergence() specifies how many errors due to non-convergence to ignore. Here, I am willing to ignore 5 such errors. The warning message informs me that the model failed to converge 2 times. Had the model failed to converge more than 5 times, the result would have been the same as with mi impute chained: mimpt would have exited with return code r(430) and discarded all imputed values.
The output reveals how mimpt works: it calls mi impute repeatedly, adding one complete dataset at a time. If an error occurs, the imputation of the respective dataset, say, m=1, is repeated. There are side effects: the model specification must be parsed repeatedly by mi impute, any warning messages (or their absence) from mi impute refer only to the last imputed dataset, and any results that mi impute leaves in r() refer only to the last imputed dataset. All this is to say: mimpt is a workaround that should be used with caution and should eventually be replaced by a corresponding option in Stata's mi impute command.
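To make the retry logic concrete, here is a minimal sketch of what a wrapper along these lines might do. This is my illustration of the idea, not mimpt's actual source code; the locals and the model specification (taken from the example above) are assumptions for the sketch:

Code:
// hypothetical sketch of a persist-on-non-convergence loop (not mimpt's source)
local todo  10   // imputations requested, cf. add(10)
local skips 5    // maximum failures to tolerate, cf. skipnonconvergence(5)
local fails 0
while (`todo' > 0) {
    // add one completed dataset at a time
    capture noisily mi impute chained ///
        (mlogit , augment) rep78      ///
        (pmm , knn(3)) mpg price      ///
        , add(1)
    if (_rc == 430) {                 // convergence not achieved
        if (`fails' >= `skips') exit 430
        local ++fails                 // ignore this failure; redo this m
    }
    else if (_rc) exit _rc            // pass any other error through
    else local --todo                 // success: one more dataset added
}
if (`fails') display as txt "Warning: the imputation model failed to converge `fails' times"

Because each pass calls mi impute with add(1), a non-convergence error discards at most the one dataset currently being imputed, while all previously added datasets survive.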
For those of you who have experienced the described problem with non-convergence, who agree with my argument, and who, for whatever reason, want to stick with mi instead of ice, mimpt is available from SSC. Thanks, as usual, to Kit Baum.
Best
Daniel