  • -mi impute-, -mlogit-, and "convergence not achieved"

    I will start this somewhat lengthy post by repeating my earlier requests on the Wishlist for Stata 17 and the Wishlist for Stata 16.

    My problem is with the behavior of mi impute chained when one of the models, usually mlogit, fails to converge. If that happens, Stata will exit with an error message and, more relevant to this post, discard all imputed values that have been added so far. This behavior is consistent with Stata's philosophy of "doing it all or doing nothing at all". It is also useful if there is something wrong with the imputation model that we should fix. The behavior is, however, frustrating if the model in question fails to converge in, say, iteration 7 on m=42. By then, the respective model has successfully converged 416 times (41 completed imputations × 10 iterations each, assuming the default burn-in, plus 6 iterations on m=42) before it failed -- once. Chances are, there is no systematic problem with that model; chances are, the model will converge again in iteration 8 on m=42.

    I argue that stopping the imputation process altogether because one of the models fails to converge once is not only frustrating but also leads to worse imputation models in practice. The reason is that, confronted with the described problem, we, as users, are left with one choice: modify the respective model. There are different ways of modifying the model, such as omitting predictors, changing (collapsing) some categories of the outcome, or using a different model, e.g., pmm. None of these modifications is desirable, and all of them will necessarily affect all iterations in all imputations, thus making the imputation model worse. Instead of affecting all iterations in all imputations, I would rather be able to skip the one iteration in which the model happens not to converge.

    The community-contributed ice command (Royston, SSC, SJ) offers a persist option that ignores errors such as non-convergence. It would be even better if we could specify which errors we are willing to ignore and how often we are willing to ignore them. Still, this option is something that StataCorp should seriously consider borrowing. Personally, I trust ice, but I am just a little bit more comfortable with Stata's mi suite. Therefore, I have written a crude workaround wrapper for mi impute that persists in case of non-convergence. Here is an example, using a modified version of auto.dta:

    Code:
    version 12.1 // needed for the seed
    set seed 42
    
    set maxiter 20 // don't want to wait for 16,000 iterations
    
    // example data
    sysuse auto , clear
    replace mpg = . if runiform()>.6
    replace price = . if runiform()>.4
    
    // mi setting
    mi set mlong
    mi register imp rep78 mpg price
    The modifications above lead to convergence issues when we impute missing values for rep78 with mlogit:

    Code:
    . mi impute chained             ///
    >     (mlogit , augment) rep78  ///
    >     (pmm , knn(3)) mpg price  ///
    >     , add(10) noisily
    
    
    Conditional models:
                 rep78: mlogit rep78 mpg price , augment noisily
                   mpg: pmm mpg i.rep78 price , knn(3) noisily
                 price: pmm price i.rep78 mpg , knn(3) noisily
    
    
    Performing monotone imputation, m=1:
    
    Running mlogit on observed data, m=1:
    
    
    Iteration 0:   log likelihood = -93.692061  
    Iteration 1:   log likelihood = -93.692061  
    
    Multinomial logistic regression                 Number of obs     =         69
                                                    LR chi2(0)        =       0.00
                                                    Prob > chi2       =          .
    Log likelihood = -93.692061                     Pseudo R2         =     0.0000
    
    ------------------------------------------------------------------------------
           rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    1            |
           _cons |   -2.70805   .7302967    -3.71   0.000    -4.139406   -1.276695
    -------------+----------------------------------------------------------------
    2            |
           _cons |  -1.321756   .3979112    -3.32   0.001    -2.101647   -.5418642
    -------------+----------------------------------------------------------------
    3            |  (base outcome)
    -------------+----------------------------------------------------------------
    4            |
           _cons |  -.5108256   .2981424    -1.71   0.087    -1.095174    .0735227
    -------------+----------------------------------------------------------------
    5            |
           _cons |  -1.003302   .3524804    -2.85   0.004    -1.694151   -.3124532
    ------------------------------------------------------------------------------
    
    [...]
    
    Running mlogit on data from iteration 8, m=1:
    
    
    Iteration 0:   log likelihood = -93.692061  
    Iteration 1:   log likelihood = -84.819893  
    Iteration 2:   log likelihood = -81.752821  
    Iteration 3:   log likelihood = -79.824403  
    Iteration 4:   log likelihood =  -79.07954  
    Iteration 5:   log likelihood = -78.816167  
    Iteration 6:   log likelihood = -78.665878  
    Iteration 7:   log likelihood = -78.582992  
    Iteration 8:   log likelihood = -78.566297  
    Iteration 9:   log likelihood = -78.562641  
    Iteration 10:  log likelihood = -78.561756  
    Iteration 11:  log likelihood = -78.561571  
    Iteration 12:  log likelihood = -78.561532  (not concave)
    Iteration 13:  log likelihood = -78.561531  (not concave)
    Iteration 14:  log likelihood =  -78.56153  (not concave)
    Iteration 15:  log likelihood =  -78.56153  (not concave)
    Iteration 16:  log likelihood =  -78.56153  (not concave)
    Iteration 17:  log likelihood =  -78.56153  (not concave)
    Iteration 18:  log likelihood =  -78.56153  (not concave)
    Iteration 19:  log likelihood =  -78.56153  (not concave)
    Iteration 20:  log likelihood =  -78.56153  (not concave)
    convergence not achieved
    
    Multinomial logistic regression                 Number of obs     =         69
                                                    LR chi2(7)        =      30.26
                                                    Prob > chi2       =     0.0001
    Log likelihood =  -78.56153                     Pseudo R2         =     0.1615
    
    ------------------------------------------------------------------------------
           rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    1            |
             mpg |  -19.66078   405.0998    -0.05   0.961    -813.6419    774.3203
           price |  -.0710612   1.818402    -0.04   0.969    -3.635063    3.492941
           _cons |   639.3719          .        .       .            .           .
    -------------+----------------------------------------------------------------
    2            |
             mpg |   -.089131   .1029034    -0.87   0.386    -.2908178    .1125559
           price |   .0001489   .0001166     1.28   0.202    -.0000797    .0003774
           _cons |  -.8377781   2.345939    -0.36   0.721    -5.435735    3.760178
    -------------+----------------------------------------------------------------
    3            |  (base outcome)
    -------------+----------------------------------------------------------------
    4            |
             mpg |    .075735   .0570139     1.33   0.184    -.0360101    .1874802
           price |   .0000827   .0000967     0.85   0.393    -.0001068    .0002722
           _cons |  -2.645726   1.600045    -1.65   0.098    -5.781756    .4903036
    -------------+----------------------------------------------------------------
    5            |
             mpg |   .1692951   .0652532     2.59   0.009     .0414013     .297189
           price |   .0000554   .0001334     0.42   0.678     -.000206    .0003169
           _cons |  -5.221147   2.054025    -2.54   0.011    -9.246962   -1.195332
    ------------------------------------------------------------------------------
    Note: 1 observation completely determined.  Standard errors questionable.
    convergence not achieved
    mlogit failed to converge on observed data
    error occurred during imputation of rep78 mpg price on m = 1
    r(430);

    Note that the model had converged 7 times before failing once. Here is how the wrapper, mimpt, works:

    Code:
    . mimpt chained                 ///
    >     (mlogit , augment) rep78  ///
    >     (pmm , knn(3)) mpg price  ///
    >     , add(10) skipnonconvergence(5)
    
    
    Conditional models:
                 rep78: mlogit rep78 mpg price , augment
                   mpg: pmm mpg i.rep78 price , knn(3)
                 price: pmm price i.rep78 mpg , knn(3)
    
    Performing chained iterations ...
    convergence not achieved
    convergence not achieved
    mlogit failed to converge on observed data
    error occurred during imputation of rep78 mpg price on m = 1
    
    [...]
    
    Conditional models:
                 rep78: mlogit rep78 mpg price , augment
                   mpg: pmm mpg i.rep78 price , knn(3)
                 price: pmm price i.rep78 mpg , knn(3)
    
    Performing chained iterations ...
    
    Multivariate imputation                     Imputations =       10
    Chained equations                                 added =        1
    Imputed: m=10                                   updated =        0
    
    Initialization: monotone                     Iterations =       10
                                                    burn-in =       10
    
                 rep78: augmented multinomial logistic regression
                   mpg: predictive mean matching
                 price: predictive mean matching
    
    ------------------------------------------------------------------
                       |               Observations per m            
                       |----------------------------------------------
              Variable |   Complete   Incomplete   Imputed |     Total
    -------------------+-----------------------------------+----------
                 rep78 |         69            5         5 |        74
                   mpg |         44           30        30 |        74
                 price |         29           45        45 |        74
    ------------------------------------------------------------------
    (complete + incomplete = total; imputed is the minimum across m
     of the number of filled-in observations.)
    
    Warning: the sets of predictors of the imputation model vary across
             imputations or iterations
    
    Warning: the imputation model failed to converge 2 times
    I have typed

    Code:
    mimpt ... , skipnonconvergence(5)
    where the required option skipnonconvergence() specifies how many errors due to non-convergence to ignore. Here, I am willing to ignore 5 such errors. The warning message informs me that the model did not converge 2 times. Had the model failed to converge more than 5 times, the result would have been the same as with mi impute chained: mimpt would have exited with return code r(430) and discarded all imputed values.

    The output reveals how mimpt works: it repeatedly calls mi impute, adding 1 complete dataset at a time. If there is an error, the imputation of the respective dataset, say, m=1, is repeated. There are side effects: the model specification must be repeatedly parsed by mi impute, any warning message (or its absence) from mi impute refers only to the last imputed dataset, and any results that mi impute returns in r() refer only to the last imputed dataset. All this is to say: mimpt is a workaround that should be used with caution and should be replaced by a corresponding option in Stata's mi impute command.
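    The retry logic described above could be sketched roughly as follows. This is a simplified illustration, not the actual mimpt source; the while/capture construction, the local macro names, and the hard-coded numbers (mirroring add(10) and skipnonconvergence(5) from the example) are my own assumptions about one way such a wrapper could be structured:

    Code:
    // sketch of a persist-style wrapper loop; NOT the actual mimpt source
    local maxskip 5                      // cf. skipnonconvergence(5)
    local skipped 0
    local m 1
    while (`m' <= 10) {                  // cf. add(10): build 10 datasets
        capture mi impute chained        ///
            (mlogit , augment) rep78     ///
            (pmm , knn(3)) mpg price     ///
            , add(1)
        if (_rc == 430) {                // convergence not achieved
            local ++skipped
            if (`skipped' > `maxskip') error 430  // give up, as mi impute would
            // otherwise: leave m unchanged and retry this dataset
        }
        else if (_rc) error _rc          // do not ignore other errors
        else local ++m                   // success: move on to the next dataset
    }
    display as text "Warning: the imputation model failed to converge `skipped' times"
    Note how each successful mi impute call keeps its one added dataset, so a later failure no longer discards earlier imputations; only the dataset currently being imputed is retried.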

    For those of you who have experienced the described problem with non-convergence, who agree with my argument, and who, for whatever reasons, want to stick with
    mi instead of ice, mimpt is available from the SSC. Thanks, as usual, to Kit Baum.

    Best
    Daniel
    Last edited by daniel klein; 02 Mar 2021, 16:21.

  • #2
    daniel klein thank you for this - I have often had the same problem with mlogit (and sometimes with ologit) within mi impute chained. I have previously spoken with some people at StataCorp about including something like "persist" from -ice-; in general, the reaction was negative because it did not provide information about the problem or the number of extra attempts. I then suggested that they include that info in their version; so far, nothing. These days, I generally start with -ice- if I have categorical variables with more than 2 categories.



    • #3
      Thanks Daniel, this is a really relevant issue. I am not an expert in imputation but need to work with it regularly. The issues you describe do happen from time to time and are indeed frustrating. This "tweaking and tinkering" with the MI model feels a bit like black magic and not very "scientific". One crude idea (which I have never actually implemented, but it came to my mind) was to run the model in a stepwise fashion: always add only 1 imputation, see if it converges, and then change the seed. By doing so, you can carefully add more imputations, and if it fails, you try again and keep the data imputed so far. I wonder if this would work, but it seems tedious.

      Best
      Felix

      (Stata 16.1 MP)



      • #4
        Originally posted by Rich Goldstein View Post
        in general, the reaction was negative because it did not provide information about the problem or the number of extra attempts
        Interesting, because I do not think this is true. ice will produce a warning message each time it skips an iteration. The warning messages also include the return code. It should be trivial to collect this information and present a summary at the end of the process, so users will not have to constantly monitor the process or scroll through endless log files. Anyway, I, too, would prefer a solution that provides more control over which errors are ignored and how often. I assume that this should also be trivial to implement; I also assume that StataCorp's priorities lie elsewhere.


        Originally posted by Felix Bittmann View Post
        This "tweaking and tinkering" with the MI model feels a bit like black magic and not very "scientific".
        It is "not scientific" in the sense that I am not aware of studies that investigate the effects of what I am doing here. I still believe that the alternatives are worse.

        Originally posted by Felix Bittmann View Post
        One crude idea (I have never actually implemented but it came to my mind) was to run the model in a stepwise fashion, so always only add 1 imputation, see if it converges and then change the seed.
        This is what mimpt does -- almost. It does not change/set the seed multiple times, and neither should you. I should have mentioned tinkering with the seed in the list of undesirable alternative "solutions" to the problem. Taken to the extreme, setting the seed after each imputed dataset makes the imputed data essentially a function of your choice of seeds. This is easiest to see in the simplest case of obtaining random numbers from runiform(). The sequence

        Code:
        set seed 1
        display runiform()
        set seed 2
        display runiform()
        ...
        is no different from the sequence

        Code:
        .012345 // chosen by me "randomly"
        .678901 // chosen by me "randomly"
        [..]
        That is, setting the seed each time is the same as setting the random number directly each time. It is very unlikely to get a random sequence this way.


        Concerning the motivation of the topic, I am actually pinning some hope on Paul Allison and Richard Williams. They have been working on ways to get better predicted probabilities from linear probability models. If this work could be extended to some sort of "multinomial linear probability model", which I do not know whether it exists, but think in the direction of a multivariate linear model, this would solve the convergence issues and also speed up the imputation process considerably.
        Last edited by daniel klein; 03 Mar 2021, 03:43. Reason: fixed the link



        • #5
          Thank you for your post. I am new to Stata, and I am experiencing everything you described. So I use -ice-, but my problem is that we still have to convert to -mi- to do some analyses.

          Running the -mi import ice- command keeps giving me the error "data already -mi set-". How do I convert from -ice- to -mi-?

          My steps currently:
          -create my -ice- dataset (impstat) using the ice command for 100 imputations
          -use impstat, clear
          -mi import ice, automatic

          Many Thanks
          Maryam

