  • Multiple imputation convergence not achieved

    Hi all,

    I'm trying to perform chained multiple imputations but get an error message about convergence. My variables are not highly correlated with one another. I end up with 15,600 - 23,700 observations at the end though (I initially had a maximum of 698 observations). Any thoughts about what I should do?
    Below are my code, the error message I get and some screenshots with information about my data
    Thanks

    Data
    24 potential covariate factors that I will have to correlate with my DVs and IV to choose which ones to include in my final regression model
    3 continuous DVs that I will independently examine (VS, PL, CE)
    1 categorical variable with 6 categories (traj_alcohol) - does not have missing data (Note. I tried to include a "=" sign and place this variable on the right-hand side but I had an error message about having missing data in it)

    Code
    Code:
    mi set mlong
    
    mi register imputed post_smoking preeclampsia preterm pregnancy_planning DASS_stress DASS_anx DASS_depr parity parenting diet mat_ment_health folate fad ethnicity psy breastfeed antidep_combined SRI SGA BMI ax drug VS PL CE child_sex  pren_smoking traj_alcohol
    
    mi impute chained (regress) mat_ment_health parity folate fad VS PL CE (logit, augment) ethnicity pregnancy_planning SRI preeclampsia preterm SGA drug DASS_stress DASS_depr DASS_anx antidep_combined post_smoking (mlogit, augment) psy breastfeed parenting BMI diet ax = traj_alcohol child_sex pren_smoking, add(43) noisily

    Output
    Code:
    Note: 92 failures and 1 success completely determined.
    convergence not achieved
    logit failed to converge on observed data
    error occurred during imputation of mat_ment_health parity folate fad VS PL CE ethnicity pregnancy_planning SRI preeclampsia preterm SGA drug
    DASS_stress DASS_depr DASS_anx antidep_combined post_smoking psy breastfeed parenting BMI diet ax on m = 4

    Screenshots (attachments not reproduced here): summary of my variables post imputation; variable manager view

  • #2
    You have started multiple threads (e.g., here) asking about essentially the same problem. That is not a good idea because it then becomes hard to keep track of all the information and make sense of what you describe. If you plan on continuing in this fashion, do include links to the other relevant threads. Also, please do not use screenshots to show output, as those might not be equally readable across browsers etc.; instead, copy the output into a code-delimited environment (the one you also use to show code).

    Concerning your main question here:

    Originally posted by Garance Delagneau
    Hi all,
    I end up with 15,600 - 23,700 observations at the end though (I initially had a maximum of 698 observations). Any thoughts about what I should do?
    Multiple imputations create multiple complete datasets. Those datasets have to be stored in some way. You have chosen mlong style, meaning that the imputed values are stored as additional observations. Thus, you are supposed to end up with more observations after mi.
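
    A minimal sketch of how one might confirm this, assuming the data are already mi set in mlong style and the imputations have been added. In that style, the system variable _mi_m holds the imputation number, with 0 marking the original data:

    ```stata
    * Report the mi style and the number of imputations currently stored
    mi query
    mi describe

    * Original observations carry _mi_m == 0; this count should equal
    * the number of observations in the original dataset
    count if _mi_m == 0

    * Observations per imputation; in mlong style, only incomplete
    * observations are replicated for m = 1, 2, ...
    tabulate _mi_m
    ```

    As a rough plausibility check: with 698 original observations and add(43), a total near 23,700 would be consistent with about 535 incomplete observations, since 698 + 43 × 535 = 23,703.
    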


    Concerning questions from your previous thread:

    Theoretically, you want to include all variables (especially the outcome/response/dependent variable) that you will use in your analyses in your imputation model. Your imputation model may contain additional variables that are not part of the analyses. Omitting variables from the imputation model will bias the respective correlations towards zero. If your imputation model does not contain more variables than your analysis model, then including observations with missing values on the outcome might result in increased standard errors; you might want to omit those observations from the analyses.

    Technically, before you go on, clear all data from memory and start fresh with your original dataset. Then, fix your two main problems: missing imputed values and non-convergence. Start with the former. Make sure that the variables you put on the right side of the equals sign in mi impute do not have missing values (do not include any string variables in the model!). If you are still getting an error due to missing imputed values, get back here, preferably with example data. Do not use the force option; it solves nothing!
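
    The check on the right-hand-side variables could be sketched as follows. This assumes the original dataset is in memory (before mi set) and uses the three complete predictors named in the original post; substitute your own variable list as needed:

    ```stata
    * Count missing values in the predictors that are supposed to be complete
    misstable summarize traj_alcohol child_sex pren_smoking

    * Stop with an error if any of them has a missing value
    assert !missing(traj_alcohol, child_sex, pren_smoking)

    * Stop with an error if any of them is a string variable
    confirm numeric variable traj_alcohol child_sex pren_smoking
    ```

    If both assert and confirm pass silently, these variables are safe to place to the right of the equals sign.
    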

    You want to add the undocumented option showcommand to your mi impute call, as in

    Code:
    mi impute chained ... , add(#) augment noisily showcommand
    This option makes it easier to identify which of the models fails. In the case of non-convergence, missing standard errors might provide a hint as to which predictors are potentially problematic. If non-convergence occurs in the observed data, or if it occurs repeatedly, you might need to remove those predictors from the respective equation.



    • #3
      Hi Daniel,

      Thank you for your response and sorry about the multiple threads - I thought that because I was getting a different error message it would be easier to start a new one but I understand that this makes things more confusing than anything else.

      I've tried again starting from scratch. Below is a summary of my variables. 3 of them do not have missing data: traj_alcohol, child_sex and pren_smoking. They are defined as "byte" variables and are categorical (traj_alcohol; coded as ranging from 0 to 5) or binomial. The other variables are defined as either "byte" when binomial/categorical or "float" or "double" when continuous.

      I've copied/pasted the code I've used and the last section of the output produced by Stata.
      Let me know if there is anything else that I should include in this post (I'm not sure how to provide example data).

      Thank you again

      Garance


      Code:
       summarize
      
          Variable |        Obs        Mean    Std. dev.       Min        Max
      -------------+---------------------------------------------------------
          study_id |        698    4737.775    1757.058       1004       7227
      mat_ment_h~h |        695    .6964844    .1715271   .1175857          1
      A6_axage_new |        696    7.361467    .5686854   5.946612   9.155373
      antidep_co~d |        696    .0517241    .2216288          0          1
               fad |        696    1.540513    .4439812          1          4
      -------------+---------------------------------------------------------
      A6_PAE_tra~w |          0
      A6_PAE_tie~w |          0
         ethnicity |        696    .1508621    .3581718          0          1
      traj_alcohol |        698     1.93553    1.706641          0          5
                VS |        509    93.83104    16.52471          3        136
      -------------+---------------------------------------------------------
                CE |        401    10.01995    2.727563          1         17
                PL |        401    10.12469    2.497882          2         19
      pregnancy_~g |        695    .2258993    .4184743          0          1
      post_smoking |        625       .1792    .3838269          0          1
               SRI |        695    .1899281    .3925265          0          1
      -------------+---------------------------------------------------------
      preeclampsia |        693     .008658    .0927117          0          1
            parity |        690    .6434783    .6965555          0          2
           preterm |        690    .0478261    .2135529          0          1
               psy |        696    .1508621    .4538556          0          2
            folate |        692    1.972543    1.043537          0          4
      -------------+---------------------------------------------------------
               SGA |        686    .0670554    .2503004          0          1
         child_sex |        698    .4971347    .5003503          0          1
        breastfeed |        632     .806962    .9271819          0          3
         parenting |        681     1.28928    1.198045          0          3
      pren_smoking |        698    .1475645    .3549221          0          1
      -------------+---------------------------------------------------------
               BMI |        676    .6597633    .9219509          0          3
              diet |        691    .9392185    .7998616          0          2
                ax |        694    .4639769    .6780409          0          2
              drug |        669    .0358744    .1861162          0          1
       DASS_stress |        696    .3635057    .6943068          0          2
      -------------+---------------------------------------------------------
         DASS_depr |        696    .2672414     .614329          0          2
          DASS_anx |        696    .2557471    .6192093          0          2
             _mi_m |        698           0           0          0          0
            _mi_id |        698       349.5    201.6395          1        698
          _mi_miss |        698    .7664756    .4233763          0          1
      Code:
      mi set mlong
      
      mi register imputed post_smoking preeclampsia preterm pregnancy_planning DASS_stress DASS_anx DASS_depr parity parenting diet mat_ment_health folate fad ethnicity psy breastfeed antidep_combined SRI SGA BMI ax drug VS PL CE child_sex  pren_smoking traj_alcohol
      
      mi impute chained (regress) mat_ment_health parity folate fad VS PL CE (logit) ethnicity pregnancy_planning SRI preeclampsia preterm SGA drug DASS_stress DASS_depr DASS_anx antidep_combined post_smoking (mlogit) psy breastfeed parenting BMI diet ax = traj_alcohol child_sex pren_smoking, add(43) augment noisily showcommand
      Code:
      Iteration 299: log likelihood = -177.30696  (backed up)
      Iteration 300: log likelihood = -177.30696  (backed up)
      convergence not achieved
      
      Multinomial logistic regression                         Number of obs =    694
                                                              LR chi2(76)   = 852.63
                                                              Prob > chi2   = 0.0000
      Log likelihood = -177.30696                             Pseudo R2     = 0.7063
      
      ------------------------------------------------------------------------------------
                      ax | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
      -------------------+----------------------------------------------------------------
      0                  |  (base outcome)
      -------------------+----------------------------------------------------------------
      1                  |
                     fad |  -45.69108   464.1733    -0.10   0.922     -955.454    864.0718
                         |
               ethnicity |
                      0  |          0  (empty)
                      1  |  -77.63112    1094.25    -0.07   0.943    -2222.321    2067.059
                         |
             DASS_stress |
                      0  |          0  (empty)
                      1  |   97.42973   1082.395     0.09   0.928    -2024.026    2218.885
                      2  |   93.21602   2066.397     0.05   0.964    -3956.847    4143.279
                         |
               DASS_depr |
                      0  |          0  (empty)
                      1  |   -56.0996   943.9529    -0.06   0.953    -1906.213    1794.014
                      2  |  -1.329389   2775.112    -0.00   1.000    -5440.448     5437.79
                         |
                DASS_anx |
                      0  |          0  (empty)
                      1  |  -45.32393   865.8734    -0.05   0.958    -1742.405    1651.757
                      2  |   55.97167   713.0152     0.08   0.937    -1341.513    1453.456
                         |
        antidep_combined |
                      0  |          0  (empty)
                      1  |   55.03474   2758.571     0.02   0.984    -5351.665    5461.735
                         |
                     psy |
                      0  |          0  (empty)
                      1  |  -61.74583     768.09    -0.08   0.936    -1567.175    1443.683
                      2  |  -61.31262   33831.79    -0.00   0.999    -66370.41    66247.78
                         |
         mat_ment_health |   -42.4544   1907.956    -0.02   0.982    -3781.979     3697.07
                         |
      pregnancy_planning |
                      0  |          0  (empty)
                      1  |  -6.840776   332.4919    -0.02   0.984    -658.5129    644.8313
                         |
                     SRI |
                      0  |          0  (empty)
                      1  |   -14.9186   471.0626    -0.03   0.975    -938.1843    908.3471
                         |
            preeclampsia |
                      0  |          0  (empty)
                      1  |  -166.4043   2791.749    -0.06   0.952    -5638.132    5305.323
                         |
                  folate |  -12.41935   189.9506    -0.07   0.948    -384.7156    359.8769
                         |
                    diet |
                      0  |          0  (empty)
                      1  |  -.3725422   508.9692    -0.00   0.999    -997.9339    997.1888
                      2  |   -67.5187    536.874    -0.13   0.900    -1119.772     984.735
                         |
                  parity |  -45.30268   438.2203    -0.10   0.918    -904.1986    813.5932
                         |
                 preterm |
                      0  |          0  (empty)
                      1  |  -73.34512   716.0278    -0.10   0.918    -1476.734    1330.044
                         |
                     SGA |
                      0  |          0  (empty)
                      1  |   71.83496   757.7226     0.09   0.924    -1413.274    1556.944
                         |
               parenting |
                      0  |          0  (empty)
                      1  |   1.281242   405.9603     0.00   0.997    -794.3863    796.9487
                      2  |   21.31427   870.3756     0.02   0.980    -1684.591    1727.219
                      3  |  -10.13924   797.5294    -0.01   0.990    -1573.268     1552.99
                         |
                     BMI |
                      0  |          0  (empty)
                      1  |    79.2376   550.4576     0.14   0.886    -999.6395    1158.115
                      2  |   33.00154   430.7382     0.08   0.939    -811.2298    877.2329
                      3  |  -30.56415   650.1648    -0.05   0.963    -1304.864    1243.735
                         |
                    drug |
                      0  |          0  (empty)
                      1  |  -47.75518   1175.688    -0.04   0.968    -2352.061    2256.551
                         |
              breastfeed |
                      0  |          0  (empty)
                      1  |   11.21488   529.7254     0.02   0.983    -1027.028    1049.458
                      2  |   17.73553   356.3998     0.05   0.960    -680.7953    716.2664
                      3  |   14.20757   909.9727     0.02   0.988    -1769.306    1797.721
                         |
            post_smoking |
                      0  |          0  (empty)
                      1  |   13.33993   852.8993     0.02   0.988    -1658.312    1684.992
                         |
                      VS |   8.839765   39.54212     0.22   0.823    -68.66136    86.34089
                      PL |  -9.171048   94.89695    -0.10   0.923    -195.1656    176.8236
                      CE |   5.068001   55.21681     0.09   0.927     -103.155     113.291
            traj_alcohol |  -17.79747   174.1776    -0.10   0.919    -359.1792    323.5843
               child_sex |  -43.41618   377.6009    -0.11   0.908    -783.5003    696.6679
            pren_smoking |  -7.241061   536.5934    -0.01   0.989    -1058.945    1044.463
                   _cons |  -860.3116   5056.164    -0.17   0.865    -10770.21    9049.587
      -------------------+----------------------------------------------------------------
      2                  |
                     fad |    .469965   .3652808     1.29   0.198    -.2459722    1.185902
                         |
               ethnicity |
                      0  |          0  (empty)
                      1  |   -.595336   .4597467    -1.29   0.195    -1.496423     .305751
                         |
             DASS_stress |
                      0  |          0  (empty)
                      1  |  -.2082667   .4994663    -0.42   0.677    -1.187203    .7706693
                      2  |   .0210388   .4897113     0.04   0.966    -.9387777    .9808554
                         |
               DASS_depr |
                      0  |          0  (empty)
                      1  |   .0072991   .5856988     0.01   0.990    -1.140649    1.155248
                      2  |  -.7070308   .6370675    -1.11   0.267     -1.95566    .5415986
                         |
                DASS_anx |
                      0  |          0  (empty)
                      1  |   .4613493   .5998554     0.77   0.442    -.7143456    1.637044
                      2  |    .609801   .5105257     1.19   0.232     -.390811    1.610413
                         |
        antidep_combined |
                      0  |          0  (empty)
                      1  |  -.0470938   .6896793    -0.07   0.946     -1.39884    1.304653
                         |
                     psy |
                      0  |          0  (empty)
                      1  |   1.213386    .482304     2.52   0.012     .2680877    2.158684
                      2  |  -.0803428   .8506425    -0.09   0.925    -1.747571    1.586886
                         |
         mat_ment_health |  -.7970812   .9625705    -0.83   0.408    -2.683685    1.089522
                         |
      pregnancy_planning |
                      0  |          0  (empty)
                      1  |  -.4848973   .3840102    -1.26   0.207    -1.237543     .267749
                         |
                     SRI |
                      0  |          0  (empty)
                      1  |  -1.586173   .5420918    -2.93   0.003    -2.648653   -.5236925
                         |
            preeclampsia |
                      0  |          0  (empty)
                      1  |   2.110037   1.118974     1.89   0.059     -.083112    4.303186
                         |
                  folate |  -.2796827   .1370871    -2.04   0.041    -.5483684   -.0109969
                         |
                    diet |
                      0  |          0  (empty)
                      1  |   .8278913    .359691     2.30   0.021     .1229099    1.532873
                      2  |    .545538   .3676865     1.48   0.138    -.1751144     1.26619
                         |
                  parity |  -.1966318   .2233375    -0.88   0.379    -.6343651    .2411016
                         |
                 preterm |
                      0  |          0  (empty)
                      1  |  -.1607845   .6681881    -0.24   0.810    -1.470409     1.14884
                         |
                     SGA |
                      0  |          0  (empty)
                      1  |   .8778893   .4990364     1.76   0.079    -.1002041    1.855983
                         |
               parenting |
                      0  |          0  (empty)
                      1  |   .7186037    .426911     1.68   0.092    -.1181265    1.555334
                      2  |   .2845771   .4267749     0.67   0.505    -.5518862    1.121041
                      3  |   .7510094   .3888689     1.93   0.053    -.0111596    1.513178
                         |
                     BMI |
                      0  |          0  (empty)
                      1  |   .0631047   .4510534     0.14   0.889    -.8209437    .9471531
                      2  |  -.1353048   .3954011    -0.34   0.732    -.9102767    .6396671
                      3  |  -.1547003   .8551669    -0.18   0.856    -1.830797    1.521396
                         |
                    drug |
                      0  |          0  (empty)
                      1  |   -.734997    1.14644    -0.64   0.521    -2.981979    1.511985
                         |
              breastfeed |
                      0  |          0  (empty)
                      1  |  -.9496476   .4813425    -1.97   0.049    -1.893062   -.0062337
                      2  |   .1847948   .3288222     0.56   0.574    -.4596849    .8292745
                      3  |  -.7389531   1.140616    -0.65   0.517    -2.974519    1.496613
                         |
            post_smoking |
                      0  |          0  (empty)
                      1  |  -.1766466    .442093    -0.40   0.689    -1.043133    .6898397
                         |
                      VS |  -.0297503    .009328    -3.19   0.001    -.0480327   -.0114678
                      PL |  -.0087076   .0574177    -0.15   0.879    -.1212442     .103829
                      CE |   .1265436   .0566181     2.24   0.025     .0155741    .2375132
            traj_alcohol |   .1033237   .0905378     1.14   0.254    -.0741271    .2807746
               child_sex |   .2898055   .2994752     0.97   0.333    -.2971552    .8767662
            pren_smoking |   .5076799   .4844453     1.05   0.295    -.4418154    1.457175
                   _cons |  -.7272601   1.475501    -0.49   0.622    -3.619189    2.164669
      ------------------------------------------------------------------------------------
      Note: 165 observations completely determined. Standard errors questionable.
      convergence not achieved
      mlogit failed to converge on observed data
      error occurred during imputation of mat_ment_health parity folate fad VS PL CE ethnicity pregnancy_planning SRI
      preeclampsia preterm SGA drug DASS_stress DASS_depr DASS_anx antidep_combined post_smoking psy breastfeed parenting BMI
      diet ax on m = 2
      r(430);



      • #4
        Thanks for posting the details.

        You do not need to register traj_alcohol, child_sex, and pren_smoking as imputed if none of them has missing values; usually, it does no harm either.
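
        A sketch of the alternative registration, assuming the same variable names as in your post. The three complete predictors are registered as regular, and only the variables with missing values are registered as imputed:

        ```stata
        mi set mlong

        * Complete predictors: registered as regular rather than imputed
        mi register regular traj_alcohol child_sex pren_smoking

        * Variables with missing values: registered as imputed
        mi register imputed post_smoking preeclampsia preterm           ///
            pregnancy_planning DASS_stress DASS_anx DASS_depr parity    ///
            parenting diet mat_ment_health folate fad ethnicity psy     ///
            breastfeed antidep_combined SRI SGA BMI ax drug VS PL CE
        ```
        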

        Your output suggests that you no longer get missing imputed values, which is what I would expect. That leaves you with the problem of non-convergence. You first want to

        Code:
        set maxiter 50
        before starting the imputation. That does not solve anything but there is probably no need to let the models iterate 300 times until they finally fail. In my experience, if the model does not converge within 50 iterations, it will not converge within 300 iterations either.

        How can you fix the problem? Basically, you have two options: you can ignore occasional non-convergence or you can modify the respective model. I have discussed the former suggestion here, where I also touch on the alternative and sketch some pros and cons.

        Concerning the latter option -- modify the respective model -- I usually look for missing or extremely large standard errors. For example, look at the variable psy

        Code:
                       psy |
                        0  |          0  (empty)
                        1  |  -61.74583     768.09    -0.08   0.936    -1567.175    1443.683
                        2  |  -61.31262   33831.79    -0.00   0.999    -66370.41    66247.78
        You can remove that variable from the respective model

        Code:
        mi impute chained                         ///
            (regress) mat_ment_health ...  ///
            (logit) ethnicity ...          ///
            (mlogit) psy ...               ///
            (mlogit , omit(i.psy)) ax             /// <- separate model for -ax-
            = traj_alcohol child_sex pren_smoking ///
            , add(43) augment noisily showcommand
        Ideally, the variables that you exclude should not be highly predictive or theoretically "important" for predicting the respective outcome (or its missingness). Also, you might want to consider how many missing values you want to impute and how much an improper model for the respective variable might jeopardize the quality of the imputations. Anyway, omitting predictors with high standard errors sometimes leads to more stable models that converge.
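
        One way to look for candidates, sketched under the assumption that sparse or empty cells are behind the huge standard errors (I have not verified this for your data), is to cross-tabulate the outcome with each suspect categorical predictor in the original data:

        ```stata
        * Cross-tabulate the outcome with suspect predictors,
        * restricted to the original (m = 0) data
        tabulate ax psy if _mi_m == 0, missing
        tabulate ax preeclampsia if _mi_m == 0, missing
        ```

        Cells with zero or very few observations suggest that a category (nearly) perfectly predicts the outcome, which makes that predictor a natural candidate for omit().
        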



        • #5
          Thank you, Daniel, for suggesting these options. I tried to adjust the model but even after removing several variables with a large standard error or number of missing observations, I still got the same error. However, it worked using mimpt. My model still failed to converge 2 times though. Is that alright (see output below)? Is there a maximum number of failures to converge that is ok to ignore? If not, is it better to include all of my variables and have 7 failures to converge, or to ignore those like psy and have 2 failures to converge? I think my model failed to converge on my dependent variables as well (CE, VS, PL). I'm not sure how that can affect the results.

          Also, I understand that I need to register all of my variables. But should I look for correlations with my DVs and IV (to know which covariates to include in my regression analyses) before or after the imputation step? That is, should I (1) look for correlations with my DVs and IV, (2) perform multiple imputations with all of my variables (except those like psy, with a very large standard error or a large number of missing values), and (3) perform my multivariate regression analysis; or should I first do the multiple imputations, then the correlations, and then the regression?

          Thank you

          Code:
          Running regress on data from iteration 10, m=10:
          
          
                Source |       SS           df       MS      Number of obs   =       401
          -------------+----------------------------------   F(39, 361)      =      1.96
                 Model |  520.540126        39  13.3471827   Prob > F        =    0.0008
              Residual |  2455.30027       361   6.8013858   R-squared       =    0.1749
          -------------+----------------------------------   Adj R-squared   =    0.0858
                 Total |   2975.8404       400    7.439601   Root MSE        =    2.6079
          
          --------------------------------------------------------------------------------------
                            CE | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
          ---------------------+----------------------------------------------------------------
                           fad |   .2787211   .3647704     0.76   0.445    -.4386207    .9960628
                   1.ethnicity |   .8784765   .4054674     2.17   0.031     .0811017    1.675851
                               |
                   DASS_stress |
                            1  |  -.7265986   .4526759    -1.61   0.109    -1.616812    .1636144
                            2  |   .6261188   .5337569     1.17   0.242    -.4235445    1.675782
                               |
                     DASS_depr |
                            1  |   .0991483   .5270657     0.19   0.851    -.9373565    1.135653
                            2  |  -.6370318   .5905186    -1.08   0.281     -1.79832    .5242568
                               |
                      DASS_anx |
                            1  |  -.0894158   .5671616    -0.16   0.875    -1.204771     1.02594
                            2  |  -.4562652   .5641245    -0.81   0.419    -1.565648    .6531179
                               |
            1.antidep_combined |   .1850992   .5814717     0.32   0.750    -.9583981    1.328597
                               |
                           psy |
                            1  |  -.1079576   .5396805    -0.20   0.842     -1.16927     .953355
                            2  |  -1.838186   .8529913    -2.15   0.032    -3.515643   -.1607303
                               |
               mat_ment_health |  -1.654755   .9295125    -1.78   0.076    -3.482694    .1731846
          1.pregnancy_planning |   .1867656   .3469064     0.54   0.591    -.4954456    .8689767
                         1.SRI |  -.3186368   .3772842    -0.84   0.399    -1.060588    .4233142
                               |
                            ax |
                            1  |   .3953724   .3114331     1.27   0.205    -.2170785    1.007823
                            2  |   .3844634   .4003786     0.96   0.338    -.4029039    1.171831
                               |
                1.preeclampsia |   .1124028   1.396588     0.08   0.936    -2.634068    2.858873
                        folate |  -.0790367   .1350862    -0.59   0.559    -.3446915    .1866181
                               |
                          diet |
                            1  |  -.7275144   .3307711    -2.20   0.028    -1.377995   -.0770341
                            2  |  -.4102278    .372949    -1.10   0.272    -1.143653    .3231977
                               |
                        parity |  -.1815686   .2060767    -0.88   0.379    -.5868302     .223693
                     1.preterm |   -.822952    .654348    -1.26   0.209    -2.109765    .4638606
                         1.SGA |  -.9861256    .522712    -1.89   0.060    -2.014069    .0418174
                               |
                     parenting |
                            1  |   .7538503   .3883015     1.94   0.053    -.0097668    1.517467
                            2  |  -.1937322   .4019139    -0.48   0.630    -.9841187    .5966544
                            3  |   .1568092   .3568492     0.44   0.661    -.5449552    .8585736
                               |
                           BMI |
                            1  |   .5578039   .4208928     1.33   0.186    -.2699058    1.385514
                            2  |   .2614018   .3525681     0.74   0.459    -.4319435    .9547472
                            3  |  -.6202437   .7468866    -0.83   0.407    -2.089039    .8485515
                               |
                        1.drug |    .969311   .7011484     1.38   0.168    -.4095375    2.348159
                               |
                    breastfeed |
                            1  |   .6967892   .4042262     1.72   0.086    -.0981447    1.491723
                            2  |  -.1125194   .3343404    -0.34   0.737    -.7700189      .54498
                            3  |  -.7492179   .8652843    -0.87   0.387    -2.450849     .952413
                               |
                1.post_smoking |   .4049712   .3930521     1.03   0.304    -.3679883    1.177931
                            VS |   .0169807   .0088548     1.92   0.056    -.0004327    .0343942
                            PL |   .2463846    .054584     4.51   0.000      .139042    .3537272
                  traj_alcohol |  -.0337894   .0876036    -0.39   0.700    -.2060669     .138488
                     child_sex |  -.1800929   .2785201    -0.65   0.518    -.7278186    .3676329
                  pren_smoking |  -1.112898   .4713662    -2.36   0.019    -2.039867   -.1859296
                         _cons |   7.086808   1.434553     4.94   0.000     4.265677     9.90794
          --------------------------------------------------------------------------------------
          
          Multivariate imputation                     Imputations =       10
          Chained equations                                 added =        1
          Imputed: m=10                                   updated =        0
          
          Initialization: monotone                     Iterations =       10
                                                          burn-in =       10
          
              mat_ment_hea~h: linear regression
                      parity: linear regression
                      folate: linear regression
                         fad: linear regression
                          VS: linear regression
                          PL: linear regression
                          CE: linear regression
                   ethnicity: logistic regression
              pregnancy_pl~g: logistic regression
                         SRI: logistic regression
                preeclampsia: augmented logistic regression
                     preterm: augmented logistic regression
                         SGA: augmented logistic regression
                        drug: augmented logistic regression
                 DASS_stress: logistic regression
                   DASS_depr: logistic regression
                    DASS_anx: augmented logistic regression
              antidep_comb~d: augmented logistic regression
                post_smoking: logistic regression
                         psy: augmented multinomial logistic regression
                  breastfeed: augmented multinomial logistic regression
                   parenting: multinomial logistic regression
                         BMI: augmented multinomial logistic regression
                        diet: multinomial logistic regression
                          ax: multinomial logistic regression
          
          ------------------------------------------------------------------
                             |               Observations per m             
                             |----------------------------------------------
                    Variable |   Complete   Incomplete   Imputed |     Total
          -------------------+-----------------------------------+----------
              mat_ment_hea~h |        695            3         3 |       698
                      parity |        690            8         8 |       698
                      folate |        692            6         6 |       698
                         fad |        696            2         2 |       698
                          VS |        509          189       189 |       698
                          PL |        401          297       297 |       698
                          CE |        401          297       297 |       698
                   ethnicity |        696            2         2 |       698
              pregnancy_pl~g |        695            3         3 |       698
                         SRI |        695            3         3 |       698
                preeclampsia |        693            5         5 |       698
                     preterm |        690            8         8 |       698
                         SGA |        686           12        12 |       698
                        drug |        669           29        29 |       698
                 DASS_stress |        696            2         2 |       698
                   DASS_depr |        696            2         2 |       698
                    DASS_anx |        696            2         2 |       698
              antidep_comb~d |        696            2         2 |       698
                post_smoking |        625           73        73 |       698
                         psy |        696            2         2 |       698
                  breastfeed |        632           66        66 |       698
                   parenting |        681           17        17 |       698
                         BMI |        676           22        22 |       698
                        diet |        691            7         7 |       698
                          ax |        694            4         4 |       698
          ------------------------------------------------------------------
          (Complete + Incomplete = Total; Imputed is the minimum across m
           of the number of filled-in observations.)
          
          Warning: the sets of predictors of the imputation model vary across
                   imputations or iterations
          
          Warning: the imputation model failed to converge 2 times



          • #6
            Originally posted by Garance Delagneau
            My model still failed to converge 2 times though. Is that alright (see output below)? Is there a maximum number of failures to converge that is ok to ignore? If not, is it better to include all of my variables and have 7 failures to converge, or to ignore those like psy and have 2 failures to converge? I think my model failed to converge on my dependent variables as well (CE, VS, PL). I'm not sure how that can affect the results.
            I am not aware of any studies (simulation or otherwise) that investigate this specific problem. Generally speaking, omitting variables will lead to biased correlations. Thus, I tend to prefer ignoring some non-convergence problems to omitting variables from the model(s). However, if the non-convergence indicates some systematic problem with the respective model, that problem should probably be addressed.

            Your output suggests that you have run the model(s) in question at least 10 (iterations) * 10 (imputations) = 100 times. It is up to you to decide whether 2 failures (or 7 failures) in 100 trials indicate a systematic problem, and whether that is more of a problem than omitting predictors from the model in 100 out of 100 trials.
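
To put the two options side by side, here is a small arithmetic sketch of the failure shares. It only uses the counts quoted above (10 iterations, 10 imputations, 2 or 7 failures). The local macro name `fits` is made up for illustration, and 100 is the lower bound mentioned above, not an exact count of model fits.

```stata
* Lower bound on the number of fits per model: iterations x imputations
local fits = 10 * 10

* Share of fits that failed to converge under each option
display "2 failures: " %4.1f (100 * 2 / `fits') "% of fits"
display "7 failures: " %4.1f (100 * 7 / `fits') "% of fits"
```

That is 2% versus 7% of fits, to be weighed against leaving a predictor out of every single fit.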


            Originally posted by Garance Delagneau
            Also, I understand that I need to register all of my variables.
            It's good practice. Technically, you only need to register the imputed variables.
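
A minimal sketch of that distinction, on made-up data rather than the poster's dataset: the variable names mirror the thread, but the values are simulated. Registering the complete predictors as regular is an assumption about what "good practice" would look like here, not something stated above.

```stata
* Toy data; names follow the thread, values are simulated
clear
set seed 12345
set obs 200
generate child_sex    = runiform() < 0.5
generate pren_smoking = runiform() < 0.2
generate VS           = 100 + 15 * rnormal()
replace  VS           = . if runiform() < 0.25   // make VS incomplete

mi set mlong
mi register imputed VS                      // has missing values, to be imputed
mi register regular child_sex pren_smoking  // complete, used only as predictors

mi describe                                 // lists what is registered as what
```

Only the first `mi register` line is strictly required before imputing VS; the second documents that the complete predictors are not meant to vary across imputations.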


            Originally posted by Garance Delagneau
            But should I look for correlations with my DVs and IV (to know which covariates to include in my regression analyses) before or after the imputation step? i.e., Should I 1. look for correlations with my DVs and IV, 2. perform multiple imputations with all of my variables (except those like psy, with a very large standard error or a large number of missing data), and 3. perform my multivariate regression analysis; or should I first do the multiple imputations, then the correlations and then the regression?
            A lot depends on your goals, and you do not say anything about them. If your goal is prediction, then it might be reasonable to select predictors on the basis of correlations. That is basically true for the imputation model because the imputation model is about prediction. Often, the analysis model is not about prediction but about estimating the (causal) "effects" of selected predictors. In that case, the decision of which variables enter the model and which do not should be mainly based on theoretical considerations.
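
For the imputation model, a correlation screen could look roughly like the sketch below. It runs on simulated data; `y`, `aux1`, `aux2`, and `miss_y` are hypothetical names, not variables from the thread. Checking correlations with a missingness indicator as well as with the observed values is a common suggestion that I am adding here as an assumption, not something stated above.

```stata
* Simulated example: y is incomplete, aux1 predicts y, aux2 predicts missingness
clear
set seed 2021
set obs 300
generate aux1 = rnormal()
generate aux2 = rnormal()
generate y    = 0.6 * aux1 + rnormal()
replace  y    = . if aux2 > 0.5 & runiform() < 0.6

* Indicator for whether y is missing
generate byte miss_y = missing(y)

* Candidates that correlate with the observed values of y
pwcorr y aux1 aux2, obs sig

* Candidates that correlate with y being missing
pwcorr miss_y aux1 aux2, obs sig
```

This kind of screen is about choosing predictors for the imputation model only; it does not replace theory when deciding what goes into the analysis model.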
            Last edited by daniel klein; 09 Jul 2021, 23:36.



            • #7
              Fantastic. Thank you so much for your help! This was extremely helpful and lifesaving. I'll now try to move on to the next step, which is my regression analysis!



              • #8
                Sorry, me again...
                I had a look at the descriptive statistics of my variables post-imputation. My DVs now range from negative to positive, which is impossible given the nature of the task they had to do; values can only be positive. Also, everything is associated with everything, which is not expected. So I must have done something wrong. Can you see anything that might be causing this?
                Below are the code used, the last section of the output obtained, and the summary of my data pre- and post-imputation in another post.
                Thank you again


                Code used:
                Code:
                mi set mlong
                
                mi register imputed post_smoking preeclampsia preterm pregnancy_planning DASS_stress DASS_anx DASS_depr parity parenting diet mat_ment_health folate fad ethnicity psy breastfeed antidep_combined SRI SGA BMI ax drug VS PL CE child_sex  pren_smoking traj_alcohol
                
                mi impute chained (regress) mat_ment_health parity folate fad VS PL CE (logit) ethnicity pregnancy_planning SRI preeclampsia preterm SGA drug DASS_stress DASS_depr DASS_anx antidep_combined (logit, omit(i.ethnicity)) post_smoking (mlogit) psy breastfeed parenting BMI diet (mlogit, omit(i.psy)) ax = traj_alcohol child_sex pren_smoking, add(43) augment noisily skipnonconvergence(43)
                Last section of the output obtained:
                Code:
                Running regress on data from iteration 10, m=43:
                
                
                      Source |       SS           df       MS      Number of obs   =       509
                -------------+----------------------------------   F(38, 470)      =      2.80
                       Model |  25610.7907        38  673.968176   Prob > F        =    0.0000
                    Residual |  113106.679       470  240.652508   R-squared       =    0.1846
                -------------+----------------------------------   Adj R-squared   =    0.1187
                       Total |   138717.47       508  273.065885   Root MSE        =    15.513
                
                --------------------------------------------------------------------------------------
                                  VS | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                ---------------------+----------------------------------------------------------------
                                 fad |   1.539368   1.873641     0.82   0.412    -2.142382    5.221118
                         1.ethnicity |    1.69814   2.045044     0.83   0.407    -2.320421      5.7167
                                     |
                         DASS_stress |
                                  1  |   5.294449    2.53807     2.09   0.038       .30708    10.28182
                                  2  |  -.0715786    2.66152    -0.03   0.979    -5.301529    5.158372
                                     |
                           DASS_depr |
                                  1  |    2.18332   3.050558     0.72   0.475    -3.811101    8.177741
                                  2  |  -.9380816   3.175874    -0.30   0.768     -7.17875    5.302587
                                     |
                            DASS_anx |
                                  1  |  -1.442668   3.263236    -0.44   0.659    -7.855007     4.96967
                                  2  |  -5.026476   2.885128    -1.74   0.082    -10.69582    .6428712
                                     |
                  1.antidep_combined |   2.044714   3.206072     0.64   0.524    -4.255296    8.344724
                                     |
                                 psy |
                                  1  |  -1.544619   2.843531    -0.54   0.587    -7.132226    4.042988
                                  2  |  -.2320085   4.206144    -0.06   0.956    -8.497182    8.033165
                                     |
                     mat_ment_health |   7.606012   4.769386     1.59   0.111    -1.765948    16.97797
                1.pregnancy_planning |   1.073305   1.768266     0.61   0.544    -2.401381     4.54799
                               1.SRI |  -.9690538   2.013859    -0.48   0.631    -4.926335    2.988228
                                2.ax |  -6.204219   2.110974    -2.94   0.003    -10.35233   -2.056103
                      1.preeclampsia |  -4.687585   7.441413    -0.63   0.529    -19.31014    9.934972
                              folate |  -.8527972   .6965093    -1.22   0.221    -2.221455    .5158605
                                     |
                                diet |
                                  1  |   6.799543   1.743821     3.90   0.000     3.372893    10.22619
                                  2  |   2.867751   1.865695     1.54   0.125    -.7983843    6.533886
                                     |
                              parity |   -.779509   1.038111    -0.75   0.453    -2.819422    1.260404
                           1.preterm |  -4.439071   3.146505    -1.41   0.159    -10.62203    1.743888
                               1.SGA |   .3896312   2.822581     0.14   0.890    -5.156809    5.936071
                                     |
                           parenting |
                                  1  |   1.410778    2.06217     0.68   0.494    -2.641437    5.462993
                                  2  |  -5.058426   2.147574    -2.36   0.019    -9.278461   -.8383923
                                  3  |  -.6735936   1.876761    -0.36   0.720    -4.361474    3.014287
                                     |
                                 BMI |
                                  1  |  -1.823205   2.229087    -0.82   0.414    -6.203414    2.557004
                                  2  |   -4.39691   1.957535    -2.25   0.025    -8.243514   -.5503055
                                  3  |   6.779849    3.47174     1.95   0.051    -.0422033     13.6019
                                     |
                              1.drug |   1.018922   3.829961     0.27   0.790    -6.507043    8.544888
                                     |
                          breastfeed |
                                  1  |  -.3149281   2.063411    -0.15   0.879     -4.36958    3.739724
                                  2  |   .5327689   1.820967     0.29   0.770    -3.045475    4.111013
                                  3  |   4.830896   4.019882     1.20   0.230    -3.068269    12.73006
                                     |
                      1.post_smoking |  -3.192069   2.041704    -1.56   0.119    -7.204067    .8199278
                                  PL |    .866554   .2963067     2.92   0.004     .2843041    1.448804
                                  CE |    1.15939   .2877811     4.03   0.000      .593893    1.724887
                        traj_alcohol |  -.0118278   .4541943    -0.03   0.979    -.9043305     .880675
                           child_sex |   1.399228   1.445403     0.97   0.334    -1.441024    4.239481
                        pren_smoking |   7.190041   2.341195     3.07   0.002     2.589536    11.79055
                               _cons |   65.91119   6.647417     9.92   0.000     52.84886    78.97353
                --------------------------------------------------------------------------------------
                
                Running regress on data from iteration 10, m=43:
                
                
                      Source |       SS           df       MS      Number of obs   =       401
                -------------+----------------------------------   F(39, 361)      =      1.56
                       Model |  360.535947        39  9.24451146   Prob > F        =    0.0202
                    Residual |  2135.22964       361  5.91476354   R-squared       =    0.1445
                -------------+----------------------------------   Adj R-squared   =    0.0520
                       Total |  2495.76559       400  6.23941397   Root MSE        =     2.432
                
                --------------------------------------------------------------------------------------
                                  PL | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                ---------------------+----------------------------------------------------------------
                                 fad |   .0631205    .342124     0.18   0.854    -.6096859    .7359269
                         1.ethnicity |  -.2463463   .3804097    -0.65   0.518    -.9944437    .5017511
                                     |
                         DASS_stress |
                                  1  |   .4088814   .4226575     0.97   0.334    -.4222986    1.240061
                                  2  |  -.1545485    .496909    -0.31   0.756    -1.131748    .8226514
                                     |
                           DASS_depr |
                                  1  |  -.9582598   .4933707    -1.94   0.053    -1.928501    .0119818
                                  2  |  -.3361011   .5529248    -0.61   0.544    -1.423459    .7512571
                                     |
                            DASS_anx |
                                  1  |   .0930699   .5342681     0.17   0.862    -.9575988    1.143739
                                  2  |    .216511   .5224002     0.41   0.679    -.8108188    1.243841
                                     |
                  1.antidep_combined |   .8262397   .5450307     1.52   0.130    -.2455942    1.898074
                                     |
                                 psy |
                                  1  |  -.4529424   .5023535    -0.90   0.368    -1.440849    .5349644
                                  2  |  -.2425259   .8045758    -0.30   0.763     -1.82477    1.339718
                                     |
                     mat_ment_health |   .0831321   .8661847     0.10   0.924     -1.62027    1.786534
                1.pregnancy_planning |  -.4912887   .3239702    -1.52   0.130    -1.128395    .1458172
                               1.SRI |   .0795936   .3481129     0.23   0.819    -.6049903    .7641775
                                     |
                                  ax |
                                  1  |  -.5893315   .2855065    -2.06   0.040    -1.150796   -.0278667
                                  2  |  -.3708259   .3732616    -0.99   0.321    -1.104866    .3632144
                                     |
                      1.preeclampsia |   .5581146   1.297085     0.43   0.667    -1.992678    3.108907
                              folate |   .0459959   .1263094     0.36   0.716    -.2023987    .2943906
                                     |
                                diet |
                                  1  |  -.1387859   .3072319    -0.45   0.652     -.742975    .4654032
                                  2  |  -.3352147   .3448425    -0.97   0.332    -1.013367    .3429377
                                     |
                              parity |   -.022789   .1927298    -0.12   0.906    -.4018033    .3562252
                           1.preterm |  -.1151393   .6040833    -0.19   0.849    -1.303104    1.072825
                               1.SGA |  -.1363162   .5042282    -0.27   0.787     -1.12791    .8552773
                                     |
                           parenting |
                                  1  |    .380623   .3671385     1.04   0.301    -.3413759    1.102622
                                  2  |   .7850842    .375851     2.09   0.037     .0459518    1.524217
                                  3  |   .3778941   .3317218     1.14   0.255    -.2744557    1.030244
                                     |
                                 BMI |
                                  1  |  -.8837415   .3923884    -2.25   0.025    -1.655396   -.1120873
                                  2  |  -.5549946   .3253735    -1.71   0.089     -1.19486    .0848709
                                  3  |   .5739108   .7007689     0.82   0.413    -.8041912    1.952013
                                     |
                              1.drug |   .2350249   .6364873     0.37   0.712    -1.016664    1.486713
                                     |
                          breastfeed |
                                  1  |   -.523364   .3761488    -1.39   0.165    -1.263082    .2163542
                                  2  |   .2358689    .320383     0.74   0.462    -.3941825    .8659204
                                  3  |   .2170943   .7516913     0.29   0.773     -1.26115    1.695338
                                     |
                      1.post_smoking |   .2013152   .3870254     0.52   0.603    -.5597924    .9624229
                                  VS |   .0133862   .0083729     1.60   0.111    -.0030796     .029852
                                  CE |   .2072466   .0485007     4.27   0.000     .1118673     .302626
                        traj_alcohol |  -.0546207   .0817641    -0.67   0.505    -.2154145    .1061731
                           child_sex |  -.0516053   .2605124    -0.20   0.843    -.5639179    .4607072
                        pren_smoking |   .0404635   .4515285     0.09   0.929    -.8474931    .9284201
                               _cons |   7.231842   1.300065     5.56   0.000     4.675189    9.788495
                --------------------------------------------------------------------------------------
                
                Running regress on data from iteration 10, m=43:
                
                
                      Source |       SS           df       MS      Number of obs   =       401
                -------------+----------------------------------   F(39, 361)      =      2.25
                       Model |  582.452451        39  14.9346782   Prob > F        =    0.0001
                    Residual |  2393.38795       361  6.62988351   R-squared       =    0.1957
                -------------+----------------------------------   Adj R-squared   =    0.1088
                       Total |   2975.8404       400    7.439601   Root MSE        =    2.5749
                
                --------------------------------------------------------------------------------------
                                  CE | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                ---------------------+----------------------------------------------------------------
                                 fad |   .2573029     .36198     0.71   0.478    -.4545514    .9691573
                         1.ethnicity |   .8541076    .400469     2.13   0.034     .0665624    1.641653
                                     |
                         DASS_stress |
                                  1  |  -.7892968   .4461289    -1.77   0.078    -1.666635    .0880411
                                  2  |   .7260578   .5247723     1.38   0.167    -.3059368    1.758052
                                     |
                           DASS_depr |
                                  1  |  -.0090079   .5250671    -0.02   0.986    -1.041582    1.023567
                                  2  |  -.7068735   .5845135    -1.21   0.227    -1.856353    .4426057
                                     |
                            DASS_anx |
                                  1  |  -.0459069    .565663    -0.08   0.935    -1.158316    1.066502
                                  2  |  -.4543039   .5526942    -0.82   0.412    -1.541208    .6326008
                                     |
                  1.antidep_combined |  -.0901437   .5788534    -0.16   0.876    -1.228492    1.048205
                                     |
                                 psy |
                                  1  |  -.1312223   .5324093    -0.25   0.805    -1.178236    .9157909
                                  2  |  -2.022253   .8452592    -2.39   0.017    -3.684503   -.3600025
                                     |
                     mat_ment_health |  -1.501619   .9136536    -1.64   0.101    -3.298371    .2951329
                1.pregnancy_planning |   .1326827   .3440161     0.39   0.700    -.5438445      .80921
                               1.SRI |  -.3079838   .3682268    -0.84   0.403    -1.032123    .4161553
                                     |
                                  ax |
                                  1  |   .5031533   .3028968     1.66   0.098    -.0925105    1.098817
                                  2  |   .4836481   .3949027     1.22   0.221    -.2929506    1.260247
                                     |
                      1.preeclampsia |   .3223942   1.373507     0.23   0.815    -2.378686    3.023475
                              folate |  -.0798887   .1336857    -0.60   0.550    -.3427893    .1830118
                                     |
                                diet |
                                  1  |  -.7292173   .3230953    -2.26   0.025    -1.364603    -.093832
                                  2  |  -.2549237   .3653255    -0.70   0.486    -.9733571    .4635098
                                     |
                              parity |  -.1704802    .203855    -0.84   0.404    -.5713726    .2304123
                           1.preterm |  -.7913583   .6382343    -1.24   0.216    -2.046483    .4637659
                               1.SGA |  -1.073663   .5308955    -2.02   0.044    -2.117699   -.0296266
                                     |
                           parenting |
                                  1  |   .5334244   .3882642     1.37   0.170    -.2301193    1.296968
                                  2  |  -.1195925   .4002718    -0.30   0.765      -.90675    .6675649
                                  3  |   .1102909   .3517858     0.31   0.754     -.581516    .8020979
                                     |
                                 BMI |
                                  1  |   .5110505   .4174753     1.22   0.222    -.3099385    1.332039
                                  2  |   .1918733   .3457198     0.55   0.579    -.4880044    .8717511
                                  3  |  -.6858662   .7417344    -0.92   0.356    -2.144529    .7727968
                                     |
                              1.drug |    1.11928   .6714145     1.67   0.096    -.2010947    2.439655
                                     |
                          breastfeed |
                                  1  |    .549952   .3982551     1.38   0.168    -.2332394    1.333143
                                  2  |  -.4172577   .3387418    -1.23   0.219    -1.083413    .2488973
                                  3  |  -1.409903   .7924617    -1.78   0.076    -2.968324     .148518
                                     |
                      1.post_smoking |   .4950992    .409079     1.21   0.227     -.309378    1.299576
                                  VS |   .0268612   .0087829     3.06   0.002     .0095891    .0441333
                                  PL |   .2323036   .0543646     4.27   0.000     .1253925    .3392147
                        traj_alcohol |  -.0354704   .0865993    -0.41   0.682    -.2057729    .1348321
                           child_sex |  -.2267101   .2755685    -0.82   0.411    -.7686313     .315211
                        pren_smoking |  -1.182221   .4739844    -2.49   0.013    -2.114338   -.2501033
                               _cons |   6.392502   1.394171     4.59   0.000     3.650786    9.134218
                --------------------------------------------------------------------------------------
                
                Multivariate imputation                     Imputations =       43
                Chained equations                                 added =        1
                Imputed: m=43                                   updated =        0
                
                Initialization: monotone                     Iterations =       10
                                                                burn-in =       10
                
                    mat_ment_hea~h: linear regression
                            parity: linear regression
                            folate: linear regression
                               fad: linear regression
                                VS: linear regression
                                PL: linear regression
                                CE: linear regression
                         ethnicity: augmented logistic regression
                    pregnancy_pl~g: logistic regression
                               SRI: logistic regression
                      preeclampsia: augmented logistic regression
                           preterm: augmented logistic regression
                               SGA: augmented logistic regression
                              drug: augmented logistic regression
                       DASS_stress: logistic regression
                         DASS_depr: logistic regression
                          DASS_anx: augmented logistic regression
                    antidep_comb~d: augmented logistic regression
                      post_smoking: logistic regression
                               psy: augmented multinomial logistic regression
                        breastfeed: augmented multinomial logistic regression
                         parenting: multinomial logistic regression
                               BMI: augmented multinomial logistic regression
                              diet: multinomial logistic regression
                                ax: multinomial logistic regression
                
                ------------------------------------------------------------------
                                   |               Observations per m             
                                   |----------------------------------------------
                          Variable |   Complete   Incomplete   Imputed |     Total
                -------------------+-----------------------------------+----------
                    mat_ment_hea~h |        695            3         3 |       698
                            parity |        690            8         8 |       698
                            folate |        692            6         6 |       698
                               fad |        696            2         2 |       698
                                VS |        509          189       189 |       698
                                PL |        401          297       297 |       698
                                CE |        401          297       297 |       698
                         ethnicity |        696            2         2 |       698
                    pregnancy_pl~g |        695            3         3 |       698
                               SRI |        695            3         3 |       698
                      preeclampsia |        693            5         5 |       698
                           preterm |        690            8         8 |       698
                               SGA |        686           12        12 |       698
                              drug |        669           29        29 |       698
                       DASS_stress |        696            2         2 |       698
                         DASS_depr |        696            2         2 |       698
                          DASS_anx |        696            2         2 |       698
                    antidep_comb~d |        696            2         2 |       698
                      post_smoking |        625           73        73 |       698
                               psy |        696            2         2 |       698
                        breastfeed |        632           66        66 |       698
                         parenting |        681           17        17 |       698
                               BMI |        676           22        22 |       698
                              diet |        691            7         7 |       698
                                ax |        694            4         4 |       698
                ------------------------------------------------------------------
                (Complete + Incomplete = Total; Imputed is the minimum across m
                 of the number of filled-in observations.)
                
                Warning: the sets of predictors of the imputation model vary across
                         imputations or iterations
                
                Warning: the imputation model failed to converge 28 times
                Summary of data post-imputation:
                Code:
                 summarize
                
                    Variable |        Obs        Mean    Std. dev.       Min        Max
                -------------+---------------------------------------------------------
                    study_id |     23,703     4694.75    1737.668       1004       7227
                mat_ment_h~h |     23,700    .6942361    .1713965   .1175857   1.156003
                A6_axage_new |     23,615    7.338732    .5651791   5.946612   9.155373
                antidep_co~d |     23,701    .0527826    .2236039          0          1
                         fad |     23,701    1.549727    .4503913   .6066008          4
                -------------+---------------------------------------------------------
                A6_PAE_tra~w |          0
                A6_PAE_tie~w |          0
                   ethnicity |     23,701    .1666596    .3726796          0          1
                traj_alcohol |     23,703    1.916466    1.725126          0          5
                          VS |     23,514    98.51167    19.59191          3   210.9654
                -------------+---------------------------------------------------------
                          CE |     23,406    10.03296    2.943263  -2.024604   20.23841
                          PL |     23,406    10.18837    2.673664  -.5185956   21.11018
                pregnancy_~g |     23,700    .2317722    .4219731          0          1
                post_smoking |     23,630    .2181126     .412973          0          1
                         SRI |     23,700    .2101688    .4074369          0          1
                -------------+---------------------------------------------------------
                preeclampsia |     23,698    .0096211    .0976161          0          1
                      parity |     23,695    .6431332    .6878674  -1.270262   2.602207
                     preterm |     23,695    .0463811    .2103135          0          1
                         psy |     23,701    .1605417    .4773405          0          2
                      folate |     23,697    1.978108     1.03111  -1.707257   5.412296
                -------------+---------------------------------------------------------
                         SGA |     23,691    .0704909    .2559779          0          1
                   child_sex |     23,703    .4917521    .4999425          0          1
                  breastfeed |     23,637    .8613191    .9330573          0          3
                   parenting |     23,686    1.243435    1.196511          0          3
                pren_smoking |     23,703    .1567312    .3635548          0          1
                -------------+---------------------------------------------------------
                         BMI |     23,681    .6810945     .936579          0          3
                        diet |     23,696    .9220544    .7940758          0          2
                          ax |     23,699    .4171906    .5754799          0          2
                        drug |     23,674    .0436766    .2043789          0          1
                 DASS_stress |     23,701    .3660183    .6946386          0          2
                -------------+---------------------------------------------------------
                   DASS_depr |     23,701    .2738703    .6151158          0          2
                    DASS_anx |     23,701    .2518037    .6098086          0          2
                       _mi_m |     23,703    21.35215    12.77908          0         43
                      _mi_id |     23,703    347.3403      204.01          1        698
                    _mi_miss |        698    .7664756    .4233763          0          1
                Summary of data pre-imputation:
                Code:
                . summarize
                
                    Variable |        Obs        Mean    Std. dev.       Min        Max
                -------------+---------------------------------------------------------
                    study_id |        698    4737.775    1757.058       1004       7227
                mat_ment_h~h |        695    .6964844    .1715271   .1175857          1
                A6_axage_new |        696    7.361467    .5686854   5.946612   9.155373
                antidep_co~d |        696    .0517241    .2216288          0          1
                         fad |        696    1.540513    .4439812          1          4
                -------------+---------------------------------------------------------
                A6_PAE_tra~w |          0
                A6_PAE_tie~w |          0
                   ethnicity |        696    .1508621    .3581718          0          1
                traj_alcohol |        698     1.93553    1.706641          0          5
                          VS |        509    93.83104    16.52471          3        136
                -------------+---------------------------------------------------------
                          CE |        401    10.01995    2.727563          1         17
                          PL |        401    10.12469    2.497882          2         19
                pregnancy_~g |        695    .2258993    .4184743          0          1
                post_smoking |        625       .1792    .3838269          0          1
                         SRI |        695    .1899281    .3925265          0          1
                -------------+---------------------------------------------------------
                preeclampsia |        693     .008658    .0927117          0          1
                      parity |        690    .6434783    .6965555          0          2
                     preterm |        690    .0478261    .2135529          0          1
                         psy |        696    .1508621    .4538556          0          2
                      folate |        692    1.972543    1.043537          0          4
                -------------+---------------------------------------------------------
                         SGA |        686    .0670554    .2503004          0          1
                   child_sex |        698    .4971347    .5003503          0          1
                  breastfeed |        632     .806962    .9271819          0          3
                   parenting |        681     1.28928    1.198045          0          3
                pren_smoking |        698    .1475645    .3549221          0          1
                -------------+---------------------------------------------------------
                         BMI |        676    .6597633    .9219509          0          3
                        diet |        691    .9392185    .7998616          0          2
                          ax |        694    .4639769    .6780409          0          2
                        drug |        669    .0358744    .1861162          0          1
                 DASS_stress |        696    .3635057    .6943068          0          2
                -------------+---------------------------------------------------------
                   DASS_depr |        696    .2672414     .614329          0          2
                    DASS_anx |        696    .2557471    .6192093          0          2



                • #9
                  I know this post has moved on, but looking back at #3, I see another problem in the output shown there that I think has been overlooked and may be contributing to subsequent difficulties.

                  In the -mlogit- that failed to converge for imputation of variable ax, it has been remarked that some variables, like psy have unreasonably large standard errors. But I think the problem may not lie with psy (or there may be a problem with psy as well) but with ax. Further evidence that ax is a problem here is that several of the predictor variables in the -mlogit- have unreasonable coefficients and standard errors. But most tellingly, the constant in that model for outcome 1 is gargantuan and has an even more outlandish standard error. This all suggests to me that outcome 1 for variable ax is a rare event and the -mlogit- cannot find enough information in the data to estimate parameters for predicting outcome = 1 from the variables. My advice would be to either eliminate the variable ax from the model, or, because there is no apparent problem with outcome = 2, eliminate outcome = 1 either by dropping those observations, or by recoding ax to 0 or 2 (if that makes sense in terms of what ax = 0 and ax = 2 mean) when it is currently given as 1 (combining categories).

                  I should add that this is a problem that I have encountered any number of times using -mlogit- in MI. Rare (or at the other extreme, almost always) outcomes are difficult to estimate and will defeat attempts at MI. It is best to scan the data for such problems before even starting and regularize such variables before attempting MI.
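
                  A minimal sketch of such a scan, assuming it is run on the original data before -mi set- (or on m = 0 only). The variable list is taken from the command in #1, and the cutoff of 10 is an arbitrary illustrative value, not an established rule:

                  Code:
                  * Illustrative pre-imputation check: flag sparse levels of categorical variables.
                  * The list of variables and the cutoff (10) are assumptions; adjust to your data.
                  local catvars ethnicity pregnancy_planning SRI preeclampsia preterm SGA drug psy breastfeed parenting BMI diet ax
                  foreach v of local catvars {
                      quietly levelsof `v', local(levels)
                      foreach l of local levels {
                          quietly count if `v' == `l'
                          if r(N) < 10 {
                              display as text "`v' = `l': " as result r(N) as text " observations"
                          }
                      }
                  }

                  Any level that shows up in this listing is a candidate for combining with a neighboring level, or the variable is a candidate for removal from the imputation model.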



                  • #10
                    Thanks, Clyde
                    How large can the imbalance between the categories of a variable be before it creates an issue? Is there a rule of thumb? I've tried combining categories and removing variables, but I still get the same issue (although to a lesser extent). Below is the output I obtained after the imputation, along with the largest imbalances between the categories of my variables.

                    Code:
                    Running regress on data from iteration 10, m=10:
                    
                    
                          Source |       SS           df       MS      Number of obs   =       372
                    -------------+----------------------------------   F(27, 344)      =      1.70
                           Model |  270.167859        27   10.006217   Prob > F        =    0.0178
                        Residual |  2024.70042       344  5.88575704   R-squared       =    0.1177
                    -------------+----------------------------------   Adj R-squared   =    0.0485
                           Total |  2294.86828       371  6.18562879   Root MSE        =    2.4261
                    
                    --------------------------------------------------------------------------------------
                                      PL | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                    ---------------------+----------------------------------------------------------------
                             1.ethnicity |  -.1644896   .4060614    -0.41   0.686    -.9631652     .634186
                                     fad |   .1440808   .3476684     0.41   0.679    -.5397426    .8279043
                           1.DASS_stress |   .0345298   .3714541     0.09   0.926    -.6960773    .7651369
                             1.DASS_depr |  -.5406795   .4203478    -1.29   0.199    -1.367455    .2860959
                              1.DASS_anx |   .1460424   .4155419     0.35   0.725    -.6712803    .9633651
                      1.antidep_combined |   .6945539   .5495353     1.26   0.207    -.3863184    1.775426
                         mat_ment_health |  -.1525394   .8691774    -0.18   0.861    -1.862111    1.557032
                    1.pregnancy_planning |  -.3422957   .3306488    -1.04   0.301    -.9926435    .3080521
                                   1.SRI |  -.1837065   .3569057    -0.51   0.607    -.8856986    .5182856
                                    1.ax |  -.6089911   .2755989    -2.21   0.028    -1.151062   -.0669201
                                  folate |   .0500318   .1290173     0.39   0.698    -.2037303    .3037938
                                         |
                                    diet |
                                      1  |  -.3192526   .3148365    -1.01   0.311    -.9384994    .2999943
                                      2  |  -.5128655    .351199    -1.46   0.145    -1.203633    .1779023
                                         |
                                  parity |  -.1076713   .1911572    -0.56   0.574    -.4836553    .2683127
                                         |
                               parenting |
                                      1  |   .4711658   .3788103     1.24   0.214    -.2739101    1.216242
                                      2  |   .5525683   .3874135     1.43   0.155    -.2094291    1.314566
                                      3  |   .2913754   .3344776     0.87   0.384    -.3665033    .9492541
                                         |
                                     BMI |
                                      1  |  -.7500518   .4004155    -1.87   0.062    -1.537623     .037519
                                      2  |  -.3252849   .3148728    -1.03   0.302    -.9446032    .2940333
                                         |
                              breastfeed |
                                      1  |  -.3605234   .3648243    -0.99   0.324    -1.078091    .3570437
                                      2  |   .0672053   .3217047     0.21   0.835    -.5655506    .6999611
                                         |
                          1.post_smoking |   .1786082   .3785186     0.47   0.637    -.5658939    .9231104
                                      VS |   .0126798   .0082473     1.54   0.125    -.0035418    .0289013
                                      CE |   .1996368   .0488639     4.09   0.000     .1035272    .2957463
                            traj_alcohol |  -.0336519   .0825625    -0.41   0.684    -.1960428     .128739
                               child_sex |   .0031046    .263045     0.01   0.991    -.5142744    .5204836
                            pren_smoking |  -.0563226   .4485183    -0.13   0.900    -.9385061     .825861
                                   _cons |   7.639255   1.399949     5.46   0.000     4.885719    10.39279
                    --------------------------------------------------------------------------------------
                    
                    Running regress on data from iteration 10, m=10:
                    
                    
                          Source |       SS           df       MS      Number of obs   =       372
                    -------------+----------------------------------   F(27, 344)      =      1.66
                           Model |  306.341574        27  11.3459842   Prob > F        =    0.0225
                        Residual |  2350.97832       344   6.8342393   R-squared       =    0.1153
                    -------------+----------------------------------   Adj R-squared   =    0.0458
                           Total |  2657.31989       371  7.16258731   Root MSE        =    2.6142
                    
                    --------------------------------------------------------------------------------------
                                      CE | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                    ---------------------+----------------------------------------------------------------
                             1.ethnicity |   .4743399   .4369144     1.09   0.278    -.3850201      1.3337
                                     fad |   .2706447    .374445     0.72   0.470    -.4658451    1.007135
                           1.DASS_stress |  -.2883158   .3999694    -0.72   0.471    -1.075009    .4983776
                             1.DASS_depr |  -.0637588   .4540275    -0.14   0.888    -.9567782    .8292606
                              1.DASS_anx |  -.2030741   .4477204    -0.45   0.650    -1.083688      .67754
                      1.antidep_combined |   .1985199   .5934374     0.33   0.738    -.9687028    1.365743
                         mat_ment_health |  -1.521981   .9330365    -1.63   0.104    -3.357155     .313194
                    1.pregnancy_planning |   .0103924   .3568501     0.03   0.977    -.6914903     .712275
                                   1.SRI |  -.1525469   .3846496    -0.40   0.692    -.9091081    .6040143
                                    1.ax |   .6268241   .2971606     2.11   0.036     .0423437    1.211305
                                  folate |  -.0860259   .1389777    -0.62   0.536    -.3593788    .1873271
                                         |
                                    diet |
                                      1  |  -.5696736   .3383726    -1.68   0.093    -1.235213    .0958661
                                      2  |  -.2280496   .3794122    -0.60   0.548    -.9743094    .5182103
                                         |
                                  parity |  -.2020587   .2057913    -0.98   0.327    -.6068263    .2027089
                                         |
                               parenting |
                                      1  |   .6845809   .4074415     1.68   0.094    -.1168094    1.485971
                                      2  |   .1449433   .4186233     0.35   0.729    -.6784402    .9683267
                                      3  |      .1592    .360717     0.44   0.659    -.5502885    .8686885
                                         |
                                     BMI |
                                      1  |   .5186417   .4327666     1.20   0.232      -.33256    1.369843
                                      2  |   .0767112    .339797     0.23   0.822    -.5916301    .7450524
                                         |
                              breastfeed |
                                      1  |   .5269437   .3926534     1.34   0.180    -.2453601    1.299247
                                      2  |  -.1702599   .3465585    -0.49   0.624    -.8519004    .5113805
                                         |
                          1.post_smoking |  -.0523332    .408001    -0.13   0.898    -.8548239    .7501574
                                      VS |   .0057942    .008912     0.65   0.516    -.0117348    .0233231
                                      PL |    .231808   .0567382     4.09   0.000     .1202105    .3434055
                            traj_alcohol |  -.0537223   .0889409    -0.60   0.546    -.2286588    .1212142
                               child_sex |  -.3180163   .2829293    -1.12   0.262    -.8745055    .2384729
                            pren_smoking |  -.6958975   .4818607    -1.44   0.150    -1.643662    .2518666
                                   _cons |   8.146877   1.509877     5.40   0.000     5.177124    11.11663
                    --------------------------------------------------------------------------------------
                    
                    Multivariate imputation                     Imputations =       10
                    Chained equations                                 added =        1
                    Imputed: m=10                                   updated =        0
                    
                    Initialization: monotone                     Iterations =       10
                                                                    burn-in =       10
                    
                        mat_ment_hea~h: linear regression
                                parity: linear regression
                                folate: linear regression
                                   fad: linear regression
                                    VS: linear regression
                                    PL: linear regression
                                    CE: linear regression
                             ethnicity: logistic regression
                        pregnancy_pl~g: logistic regression
                                   SRI: logistic regression
                           DASS_stress: logistic regression
                             DASS_depr: logistic regression
                              DASS_anx: logistic regression
                        antidep_comb~d: logistic regression
                          post_smoking: logistic regression
                            breastfeed: multinomial logistic regression
                             parenting: multinomial logistic regression
                                   BMI: multinomial logistic regression
                                  diet: multinomial logistic regression
                                    ax: multinomial logistic regression
                    
                    ------------------------------------------------------------------
                                       |               Observations per m             
                                       |----------------------------------------------
                              Variable |   Complete   Incomplete   Imputed |     Total
                    -------------------+-----------------------------------+----------
                        mat_ment_hea~h |        649            3         3 |       652
                                parity |        644            8         8 |       652
                                folate |        646            6         6 |       652
                                   fad |        650            2         2 |       652
                                    VS |        474          178       178 |       652
                                    PL |        372          280       280 |       652
                                    CE |        372          280       280 |       652
                             ethnicity |        651            1         1 |       652
                        pregnancy_pl~g |        649            3         3 |       652
                                   SRI |        649            3         3 |       652
                           DASS_stress |        650            2         2 |       652
                             DASS_depr |        650            2         2 |       652
                              DASS_anx |        650            2         2 |       652
                        antidep_comb~d |        650            2         2 |       652
                          post_smoking |        586           66        66 |       652
                            breastfeed |        593           59        59 |       652
                             parenting |        637           15        15 |       652
                                   BMI |        631           21        21 |       652
                                  diet |        645            7         7 |       652
                                    ax |        648            4         4 |       652
                    ------------------------------------------------------------------
                    (Complete + Incomplete = Total; Imputed is the minimum across m
                     of the number of filled-in observations.)
                    
                    . 
                    . summarize
                    
                        Variable |        Obs        Mean    Std. dev.       Min        Max
                    -------------+---------------------------------------------------------
                        study_id |      5,642    4659.077    1748.223       1004       7227
                    mat_ment_h~h |      5,639    .6952676    .1728803   .1175857   1.129154
                    A6_axage_new |      5,631    7.337843    .5668842   5.946612   9.155373
                    antidep_co~d |      5,640    .0521277    .2223041          0          1
                             fad |      5,640     1.54567    .4527279   .5951965          4
                    -------------+---------------------------------------------------------
                    A6_PAE_tra~w |          0
                    A6_PAE_tie~w |          0
                       ethnicity |      5,641    .1526325    .3596648          0          1
                    traj_alcohol |      5,642    1.960475    1.740382          0          5
                              VS |      5,464    93.16169    16.83462          3   151.2634
                    -------------+---------------------------------------------------------
                              CE |      5,362    10.10246    2.901399  -1.124528   19.36325
                              PL |      5,362    10.31533    2.625464   1.775923    19.9746
                    pregnancy_~g |      5,639    .2202518    .4144532          0          1
                    post_smoking |      5,576    .2098278    .4072221          0          1
                             SRI |      5,639    .2000355    .4000621          0          1
                    -------------+---------------------------------------------------------
                    preeclampsia |      5,587    .0100233    .0996222          0          1
                          parity |      5,634    .6561579    .6900867  -1.172578    2.06418
                         preterm |      5,554    .0469932    .2116433          0          1
                             psy |      5,620    .1104982    .3135376          0          1
                          folate |      5,636    1.963074    1.036312  -.8975469   4.632943
                    -------------+---------------------------------------------------------
                             SGA |      5,510           0           0          0          0
                       child_sex |      5,642      .48972    .4999386          0          1
                      breastfeed |      5,583    .7954505    .8403993          0          2
                       parenting |      5,627    1.240626     1.20198          0          3
                    pren_smoking |      5,642    .1472882    .3544244          0          1
                    -------------+---------------------------------------------------------
                             BMI |      5,621    .6322718    .8376065          0          2
                            diet |      5,635    .9188997    .7942303          0          2
                              ax |      5,638    .3733593    .4837392          0          1
                            drug |      5,378    .0334697    .1798763          0          1
                     DASS_stress |      5,640    .2427305    .4287715          0          1
                    -------------+---------------------------------------------------------
                       DASS_depr |      5,640    .1874113     .390276          0          1
                        DASS_anx |      5,640    .1535461    .3605451          0          1
                           _mi_m |      5,642     4.86441    3.223388          0         10
                          _mi_id |      5,642    327.1567    187.9885          1        652
                        _mi_miss |        652    .7653374    .4241131          0          1

                    Discrepancies:
                    Code:
                          child |
                         mental |
                         health |      Freq.     Percent        Cum.
                    ------------+-----------------------------------
                              0 |        534       89.45       89.45
                              1 |         63       10.55      100.00
                    ------------+-----------------------------------
                          Total |        597      100.00
                    
                          
                           DASS |
                        anxiety |      Freq.     Percent        Cum.
                    ------------+-----------------------------------
                              0 |        508       85.09       85.09
                              1 |         35        5.86       90.95
                              2 |         54        9.05      100.00
                    ------------+-----------------------------------
                          Total |        597      100.00
                    
                          
                         parity |      Freq.     Percent        Cum.
                    ------------+-----------------------------------
                              0 |        280       47.38       47.38
                              1 |        235       39.76       87.14
                              2 |         76       12.86      100.00
                    ------------+-----------------------------------
                          Total |        591      100.00
                    
                          
                         folate |
                         intake |
                      pregnancy |      Freq.     Percent        Cum.
                    ------------+-----------------------------------
                              0 |         55        9.24        9.24
                              1 |        131       22.02       31.26
                              2 |        216       36.30       67.56
                              3 |        161       27.06       94.62
                              4 |         32        5.38      100.00
                    ------------+-----------------------------------
                          Total |        595      100.00



                    • #11
                      I don't know of any widely recognized rules of thumb. In my own limited experience, I have found that levels with fewer than 10 occurrences in the data are unworkable. And I have sometimes encountered difficulties even with several dozen occurrences. It's really more a matter of try it and see what happens.

                      As for when you can combine categories, that depends not on any statistics but on the meaning of the categories. You can combine categories if their meanings are sufficiently similar. So if you have a variable with levels corresponding to different species of animals, you might be willing to combine two different levels that are both mammals, but not merge a mammal level with a bird level. Or perhaps you could merge a mammal level with a bird level if the other levels were all invertebrates. Basically, ask yourself: if I combine these two levels of this variable, can I give a sensible name to the combined level? If you can, you can combine them. If not, you probably shouldn't.
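
                      Mechanically, combining levels might look like the sketch below. It assumes that levels 1 and 2 of ax can be given a common, meaningful name; the new variable name ax_bin and both value labels are hypothetical placeholders, since I don't know what the levels of ax represent:

                      Code:
                      * Sketch only: merge levels 1 and 2 of ax into a single level.
                      * ax_bin and the labels "none"/"any" are placeholders, not taken from the data.
                      recode ax (0 = 0 "none") (1 2 = 1 "any"), generate(ax_bin)

                      * Cross-check the recoding against the original variable, including missing values.
                      tabulate ax ax_bin, missing

                      If you cannot replace the placeholder labels with names that make substantive sense, that is a sign the levels should not be merged.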



                      • #12
                        Originally posted by Garance Delagneau
                        I had a look at the descriptive statistics of my variables post imputation. My DVs now range from negative to positive, which is impossible given the nature of the task they had to do. Values can only be positive.
                        Although it is counter-intuitive, implausible imputed values are not necessarily a problem. In some situations (in which we impute interactions of variables), it seems that the implausible values actually produce valid results while more plausible values do not (von Hippel 2009). You are using a linear regression model to impute the values. Predictions from a linear regression model are not bounded. This is, technically speaking, why you are getting negative values. Now, I have no idea what your variable represents, but if it somehow measures performance on the "task they had to do", then the measure might well be truncated at zero and, thus, unable to differentiate well in the lower ranges of performance. In other words, your test might be flawed. From that perspective, the negative values could actually "correct" for the flaws of the test. Anyway, let's assume for now that the test is valid and you want the imputed values to be truncated at zero. You will then need to switch to a different imputation model; pmm or truncreg come to mind.
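
                        To illustrate the syntax only, here is a deliberately reduced sketch that imputes just the three outcomes from the three complete variables. The values in knn(), add(), and rseed() are arbitrary choices for the example, and your real model would include the other incomplete variables as well:

                        Code:
                        * Sketch only: predictive mean matching borrows imputed values from
                        * observed donors, so imputations stay within the observed range.
                        * knn(5), add(10), and the seed are illustrative assumptions.
                        mi set mlong
                        mi register imputed VS PL CE
                        mi impute chained (pmm, knn(5)) VS PL CE = traj_alcohol child_sex pren_smoking, add(10) rseed(12345)

                        * Alternative: truncated regression with a lower limit of zero.
                        * mi impute chained (truncreg, ll(0)) VS PL CE = traj_alcohol child_sex pren_smoking, add(10) rseed(12345)

                        With pmm, the number of donors in knn() is a judgment call; very small values reuse the same donors often, while very large values weaken the match.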

                        Originally posted by Garance Delagneau
                        Also, everything is associated with everything, which is not expected. So I must have done something wrong.
                        This is stating the obvious, but if we discarded all data that contradict our expectations, then there would be no point in doing any analyses. I guess we agree on that. More to the point, multiple imputation does not actually create new associations; it recreates or "amplifies" existing associations in the data. This is also evident in the summary statistics pre- and post-imputation, which are very similar.
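
                        One way to make that comparison without the stacked -mlong- rows getting in the way is to summarize the original data (m = 0) and a few individual imputations separately. This is a sketch that assumes the data are already -mi set- and imputed; the choice of imputations 1 and 2 is arbitrary:

                        Code:
                        * Summarize the outcomes in the original data (m = 0) and in imputations 1 and 2.
                        mi xeq 0 1 2: summarize VS PL CE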

                        Addressing one of the valid points that Clyde raises, it is obviously true that rare events cause estimation problems, especially when combined with many categorical predictors. Sometimes, removing some of the categorical predictors results in a more reasonable constant and more stable estimation. Sometimes it does not. Clyde proposes omitting the respective variable. I did not completely understand whether that is merely a "practical" suggestion to get the model running or whether Clyde is worried about somehow biasing the results if we keep the variable. I have argued that omitting the variable might be worse than using the predictions that we are getting (98 out of 100 trials, in this example) because omitting the variable will inevitably bias the associations towards zero. Admittedly, with 4 missing values (out of 698), the results are probably not much affected either way.

                        Clyde proposes another possible solution to the problem of non-convergence: combining categories. I would like to add that you can also combine categories of the predictors in the models. You might even try to include ordered variables as continuous predictors in mlogit models that are difficult to estimate. Also, if the categories of an imputed variable can be ordered, then you should switch the respective imputation model to ologit, which is sometimes more stable than mlogit, or even to pmm, which is completely stable (yet considerably slows down the imputation in large samples). For example, if your DASS_ variables are psychological measures of anxiety, etc., those are almost certainly ordered.
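                        As a rough sketch of what switching the DASS_ variables to an ordered model could look like (again assuming the data are mi set and registered as in your first post, that the right-hand-side variables are complete, and with an arbitrary rseed() of my choosing):

                        ```stata
                        * Ordered logit for the three DASS_ variables, assuming their
                        * categories are ordered (e.g., low < medium < high).
                        mi impute chained (ologit) DASS_stress DASS_anx DASS_depr ///
                            = i.traj_alcohol child_sex pren_smoking,              ///
                            add(43) rseed(12345)

                        * Alternative if ologit does not converge either (run instead,
                        * not in addition): predictive mean matching with 5 donors.
                        * mi impute chained (pmm, knn(5)) DASS_stress DASS_anx DASS_depr ///
                        *     = i.traj_alcohol child_sex pren_smoking,                   ///
                        *     add(43) rseed(12345)
                        ```

                        Only the DASS_ equations are shown here; in the full model they would sit alongside the other equations in the same mi impute chained call.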


                        von Hippel, P. T. 2009. How to impute interactions, squares, and other transformed variables. Sociological Methodology, 39(1), 265-291.
                        Last edited by daniel klein; 11 Jul 2021, 00:48.

                        Comment


                        • #13
                          Fantastic, thank you so much Clyde and Daniel

                          Comment


                          • #14
                            Just one more detail: I have just noticed that you are imputing the DASS_* variables with a (binary) logit model. Given that those variables take at least three values (0-2), you certainly do not want a logit model there!
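                            A quick way to check how many distinct values each variable takes before choosing its imputation method is sketched below. It assumes the data are already mi set; m = 0 refers to the original, unimputed data.

                            ```stata
                            * One-way tabulations of the observed data, including missing values.
                            * More than two non-missing categories rules out a binary logit.
                            mi xeq 0: tab1 DASS_stress DASS_anx DASS_depr, missing
                            ```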

                            Comment
