  • -mi impute-, -mlogit-, and "convergence not achieved"

    I will start this somewhat lengthy post by repeating my earlier requests on the Wishlist for Stata 17 and the Wishlist for Stata 16.

    My problem is with the behavior of mi impute chained when one of the models, usually mlogit, fails to converge. If that happens, Stata will exit with an error message and, more relevant to this post, discard all imputed values that have been added so far. This behavior is consistent with Stata's philosophy of "doing it all or doing nothing at all". It is also useful if there is something wrong with the imputation model that we should fix. The behavior is, however, frustrating if the model in question fails to converge in, say, iteration 7 on m=42. By then, the respective model has successfully converged 416 times (41 completed imputations × 10 iterations each, assuming the default burn-in, plus 6 iterations on m=42) before it failed -- once. Chances are, there is no systematic problem with that model; chances are, the model will converge again in iteration 8 on m=42.

    I argue that stopping the imputation process altogether because one of the models fails to converge once is not only frustrating but also leads to worse imputation models in practice. The reason is that, confronted with the described problem, we, as users, are left with one choice: modify the respective model. There are different ways of modifying the model, such as omitting predictors, changing (collapsing) some categories of the outcome, or using a different model, e.g., pmm. None of these modifications is desirable, and all of them will necessarily affect all iterations in all imputations, thus making the imputation model worse. Instead of affecting all iterations in all imputations, I would rather be able to skip the one iteration in which the model happens not to converge.

    The community-contributed ice command (Royston, SSC, SJ) offers a persist option that ignores errors such as non-convergence. It would be even better if we could specify which errors we are willing to ignore and how often we are willing to ignore them. Still, this option is something that StataCorp should seriously consider borrowing. Personally, I trust ice, but I am just a little bit more comfortable with Stata's mi suite. Therefore, I have written a crude workaround wrapper for mi impute that persists in case of non-convergence. Here is an example, using a modified version of auto.dta:

    Code:
    version 12.1 // needed for the seed
    set seed 42
    
    set maxiter 20 // don't want to wait for 16,000 iterations
    
    // example data
    sysuse auto , clear
    replace mpg = . if runiform()>.6
    replace price = . if runiform()>.4
    
    // mi setting
    mi set mlong
    mi register imp rep78 mpg price
    The modifications above lead to convergence issues when we impute missing values for rep78 with mlogit:

    Code:
    . mi impute chained             ///
    >     (mlogit , augment) rep78  ///
    >     (pmm , knn(3)) mpg price  ///
    >     , add(10) noisily
    
    
    Conditional models:
                 rep78: mlogit rep78 mpg price , augment noisily
                   mpg: pmm mpg i.rep78 price , knn(3) noisily
                 price: pmm price i.rep78 mpg , knn(3) noisily
    
    
    Performing monotone imputation, m=1:
    
    Running mlogit on observed data, m=1:
    
    
    Iteration 0:   log likelihood = -93.692061  
    Iteration 1:   log likelihood = -93.692061  
    
    Multinomial logistic regression                 Number of obs     =         69
                                                    LR chi2(0)        =       0.00
                                                    Prob > chi2       =          .
    Log likelihood = -93.692061                     Pseudo R2         =     0.0000
    
    ------------------------------------------------------------------------------
           rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    1            |
           _cons |   -2.70805   .7302967    -3.71   0.000    -4.139406   -1.276695
    -------------+----------------------------------------------------------------
    2            |
           _cons |  -1.321756   .3979112    -3.32   0.001    -2.101647   -.5418642
    -------------+----------------------------------------------------------------
    3            |  (base outcome)
    -------------+----------------------------------------------------------------
    4            |
           _cons |  -.5108256   .2981424    -1.71   0.087    -1.095174    .0735227
    -------------+----------------------------------------------------------------
    5            |
           _cons |  -1.003302   .3524804    -2.85   0.004    -1.694151   -.3124532
    ------------------------------------------------------------------------------
    
    [...]
    
    Running mlogit on data from iteration 8, m=1:
    
    
    Iteration 0:   log likelihood = -93.692061  
    Iteration 1:   log likelihood = -84.819893  
    Iteration 2:   log likelihood = -81.752821  
    Iteration 3:   log likelihood = -79.824403  
    Iteration 4:   log likelihood =  -79.07954  
    Iteration 5:   log likelihood = -78.816167  
    Iteration 6:   log likelihood = -78.665878  
    Iteration 7:   log likelihood = -78.582992  
    Iteration 8:   log likelihood = -78.566297  
    Iteration 9:   log likelihood = -78.562641  
    Iteration 10:  log likelihood = -78.561756  
    Iteration 11:  log likelihood = -78.561571  
    Iteration 12:  log likelihood = -78.561532  (not concave)
    Iteration 13:  log likelihood = -78.561531  (not concave)
    Iteration 14:  log likelihood =  -78.56153  (not concave)
    Iteration 15:  log likelihood =  -78.56153  (not concave)
    Iteration 16:  log likelihood =  -78.56153  (not concave)
    Iteration 17:  log likelihood =  -78.56153  (not concave)
    Iteration 18:  log likelihood =  -78.56153  (not concave)
    Iteration 19:  log likelihood =  -78.56153  (not concave)
    Iteration 20:  log likelihood =  -78.56153  (not concave)
    convergence not achieved
    
    Multinomial logistic regression                 Number of obs     =         69
                                                    LR chi2(7)        =      30.26
                                                    Prob > chi2       =     0.0001
    Log likelihood =  -78.56153                     Pseudo R2         =     0.1615
    
    ------------------------------------------------------------------------------
           rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    1            |
             mpg |  -19.66078   405.0998    -0.05   0.961    -813.6419    774.3203
           price |  -.0710612   1.818402    -0.04   0.969    -3.635063    3.492941
           _cons |   639.3719          .        .       .            .           .
    -------------+----------------------------------------------------------------
    2            |
             mpg |   -.089131   .1029034    -0.87   0.386    -.2908178    .1125559
           price |   .0001489   .0001166     1.28   0.202    -.0000797    .0003774
           _cons |  -.8377781   2.345939    -0.36   0.721    -5.435735    3.760178
    -------------+----------------------------------------------------------------
    3            |  (base outcome)
    -------------+----------------------------------------------------------------
    4            |
             mpg |    .075735   .0570139     1.33   0.184    -.0360101    .1874802
           price |   .0000827   .0000967     0.85   0.393    -.0001068    .0002722
           _cons |  -2.645726   1.600045    -1.65   0.098    -5.781756    .4903036
    -------------+----------------------------------------------------------------
    5            |
             mpg |   .1692951   .0652532     2.59   0.009     .0414013     .297189
           price |   .0000554   .0001334     0.42   0.678     -.000206    .0003169
           _cons |  -5.221147   2.054025    -2.54   0.011    -9.246962   -1.195332
    ------------------------------------------------------------------------------
    Note: 1 observation completely determined.  Standard errors questionable.
    convergence not achieved
    mlogit failed to converge on observed data
    error occurred during imputation of rep78 mpg price on m = 1
    r(430);

    Note that the model had converged 7 times before failing once. Here is how the wrapper, mimpt, works:

    Code:
    . mimpt chained                 ///
    >     (mlogit , augment) rep78  ///
    >     (pmm , knn(3)) mpg price  ///
    >     , add(10) skipnonconvergence(5)
    
    
    Conditional models:
                 rep78: mlogit rep78 mpg price , augment
                   mpg: pmm mpg i.rep78 price , knn(3)
                 price: pmm price i.rep78 mpg , knn(3)
    
    Performing chained iterations ...
    convergence not achieved
    convergence not achieved
    mlogit failed to converge on observed data
    error occurred during imputation of rep78 mpg price on m = 1
    
    [...]
    
    Conditional models:
                 rep78: mlogit rep78 mpg price , augment
                   mpg: pmm mpg i.rep78 price , knn(3)
                 price: pmm price i.rep78 mpg , knn(3)
    
    Performing chained iterations ...
    
    Multivariate imputation                     Imputations =       10
    Chained equations                                 added =        1
    Imputed: m=10                                   updated =        0
    
    Initialization: monotone                     Iterations =       10
                                                    burn-in =       10
    
                 rep78: augmented multinomial logistic regression
                   mpg: predictive mean matching
                 price: predictive mean matching
    
    ------------------------------------------------------------------
                       |               Observations per m            
                       |----------------------------------------------
              Variable |   Complete   Incomplete   Imputed |     Total
    -------------------+-----------------------------------+----------
                 rep78 |         69            5         5 |        74
                   mpg |         44           30        30 |        74
                 price |         29           45        45 |        74
    ------------------------------------------------------------------
    (complete + incomplete = total; imputed is the minimum across m
     of the number of filled-in observations.)
    
    Warning: the sets of predictors of the imputation model vary across
             imputations or iterations
    
    Warning: the imputation model failed to converge 2 times
    I have typed

    Code:
    mimpt ... , skipnonconvergence(5)
    where the required option skipnonconvergence() specifies how many errors due to non-convergence to ignore. Here, I am willing to ignore 5 such errors. The warning message informs me that the model did not converge 2 times. Had the model failed to converge more than 5 times, the result would have been the same as with mi impute chained: mimpt would have exited with return code r(430) and discarded all imputed values.

    The output reveals how mimpt works: it repeatedly calls mi impute, adding 1 complete dataset at a time. If there is an error, the imputation of the respective dataset, say, m=1, is repeated. There are side effects: the model specification must be repeatedly parsed by mi impute, any warning message (or its absence) from mi impute refers only to the last imputed dataset, and any results that mi impute returns in r() refer only to the last imputed dataset. All this is to say: mimpt is a workaround that should be used with caution and should be replaced by a corresponding option in Stata's mi impute command.
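    The retry logic described above could be sketched roughly as follows. This is a simplified illustration, not the actual mimpt source; the while/capture construction, the local macro names, and the hard-coded numbers (mirroring add(10) and skipnonconvergence(5) from the example) are my own assumptions about one way such a wrapper could be structured:

    Code:
    // sketch of a persist-style wrapper loop; NOT the actual mimpt source
    local maxskip 5                      // cf. skipnonconvergence(5)
    local skipped 0
    local m 1
    while (`m' <= 10) {                  // cf. add(10): build 10 datasets
        capture mi impute chained        ///
            (mlogit , augment) rep78     ///
            (pmm , knn(3)) mpg price     ///
            , add(1)
        if (_rc == 430) {                // convergence not achieved
            local ++skipped
            if (`skipped' > `maxskip') error 430  // give up, as mi impute would
            // otherwise: leave m unchanged and retry this dataset
        }
        else if (_rc) error _rc          // do not ignore other errors
        else local ++m                   // success: move on to the next dataset
    }
    display as text "Warning: the imputation model failed to converge `skipped' times"
    Note how each successful mi impute call keeps its one added dataset, so a later failure no longer discards earlier imputations; only the dataset currently being imputed is retried.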

    For those of you who have experienced the described problem with non-convergence, who agree with my argument, and who, for whatever reasons, want to stick with
    mi instead of ice, mimpt is available from the SSC. Thanks, as usual, to Kit Baum.

    Best
    Daniel
    Last edited by daniel klein; 02 Mar 2021, 16:21.

  • #2
    daniel klein thank you for this - I have often had the same problem with mlogit (and sometimes with ologit) within mi impute chained. I have previously spoken with some people at StataCorp about including something like "persist" from -ice-; in general, the reaction was negative because it did not provide information about the problem or the number of extra attempts. I then suggested that they include that info in their version; so far, nothing. These days, I generally start with -ice- if I have categorical variables with more than 2 categories.



    • #3
      Thanks Daniel, this is a really relevant issue. I am not an expert in imputation but need to work with it regularly. The issues you describe do happen from time to time and are indeed frustrating. This "tweaking and tinkering" with the MI model feels a bit like black magic and not very "scientific". One crude idea (which I have never actually implemented, but it came to my mind) was to run the model in a stepwise fashion: always add only 1 imputation, see if it converges, and then change the seed. By doing so, you can carefully add more imputations, and if it fails, you try again and keep the data imputed so far. I wonder if this would work, but it seems tedious.

      Best
      Felix

      (Stata 16.1 MP)



      • #4
        Originally posted by Rich Goldstein View Post
        in general, the reaction was negative because it did not provide information about the problem or the number of extra attempts
        Interesting, because I do not think this is true. ice will produce a warning message each time it skips an iteration. The warning messages also include the return code. It should be trivial to collect this information and present a summary at the end of the process, so users will not have to constantly monitor the process or scroll through endless log files. Anyway, I, too, would prefer a solution that provides more control over which errors are ignored and how often. I assume that this should also be trivial to implement; I also assume that StataCorp's priorities lie elsewhere.


        Originally posted by Felix Bittmann View Post
        This "tweaking and tinkering" with the MI model feels a bit like black magic and not very "scientific".
        It is "not scientific" in the sense that I am not aware of studies that investigate the effects of what I am doing here. I still believe that the alternatives are worse.

        Originally posted by Felix Bittmann View Post
        One crude idea (I have never actually implemented but it came to my mind) was to run the model in a stepwise fashion, so always only add 1 imputation, see if it converges and then change the seed.
        This is what mimpt does -- almost. It does not change/set the seed multiple times, and neither should you. I should have mentioned tinkering with the seed in the list of undesirable alternative "solutions" to the problem. Taken to the extreme, setting the seed after each imputed dataset makes the imputed data essentially a function of your choice of seeds. This is easiest to see in the simplest case of obtaining random numbers from runiform(). The sequence

        Code:
        set seed 1
        display runiform()
        set seed 2
        display runiform()
        ...
        is no different from the sequence

        Code:
        .012345 // chosen by me "randomly"
        .678901 // chosen by me "randomly"
        [..]
        That is, setting the seed each time is the same as setting the random number directly each time. It is very unlikely to get a random sequence this way.


        Concerning the motivation of the topic, I am actually pinning some hope on Paul Allison and Richard Williams. They have been working on ways to get better predicted probabilities from linear probability models. If this work could be extended to some sort of "multinomial linear probability model", which I do not know whether it exists, but think in the direction of a multivariate linear model, this would solve the convergence issues and also speed up the imputation process considerably.
        Last edited by daniel klein; 03 Mar 2021, 03:43. Reason: fixed the link



        • #5
          Thank you for your post. I am new to Stata, and I am experiencing everything you described. So I use -ice-, but my problem is that we still have to convert to -mi- to do some analyses.

          Running the -mi import ice- command keeps giving me the error "data already -mi set-". How do I convert from -ice- to -mi-?

          My steps currently:
          -create my -ice- dataset (impstat) using the ice command for 100 imputations
          -use impstat, clear
          -mi import ice, automatic

          Many Thanks
          Maryam

