  • MICE for single imputation

    Hello,

    I am wondering whether multiple imputation by chained equations (MICE) can be used to create just a single imputed dataset in Stata. I think the command I am using for analysis is not compatible with multiple imputation (the command is gllamm, a user-written command for multilevel models). So, is it reasonable to run just one imputation in MICE (see code below) and then run my analysis as I normally would (i.e., without the mi estimate command)? I chose MICE because I have several variable types that need to be imputed (binary, continuous, and ordinal/categorical), and it seemed to be the right imputation option given that variety.

    I am open to other single imputation suggestions if anyone has them.

    Code:
    mi set mlong;
    xi: mi register imputed Lr2_number_adults_ LSize_Cat_ Lr_Do_you_own_;
    
    xi: mi register regular Garden_Active_ i.Year Garden_ID LSite_Visit_Curr_or_Prior_ LSold_GID_ LPickups_ LUR_Curr_Yr_or_Prior_ LSOD_Curr_or_Prior_ LKGD_Curr_Or_Prior_ LCommunity_Garden_ LMarket_Garden_ LYr_Act_Prior_ LSoil_Test_Curr_or_Prior_ i.r_L_classes i.r_L_volunteer_3_max i.r_L_social_2_max;
    xi: mi impute chained (logit) Lr_Do_you_own_ (regress) Lr2_number_adults_ (ologit) LSize_Cat_ = Garden_Active_ i.Year Garden_ID LCommunity_Garden_ LMarket_Garden_ LSite_Visit_Curr_or_Prior_ LSold_GID_ LPickups_ LUR_Curr_Yr_or_Prior_ LSOD_Curr_or_Prior_ LKGD_Curr_Or_Prior_ LYr_Act_Prior_ LSoil_Test_Curr_or_Prior_ i.r_L_classes i.r_L_volunteer_3_max i.r_L_social_2_max, add(1);

    Many thanks,
    Alyssa
    Last edited by Alyssa Beavers; 20 Feb 2019, 16:25. Reason: somehow an emoji showed up in my post

  • #2
    Technically, it is possible to create just one completed dataset. However, just running your gllamm command (without mi estimate) will not produce the results you expect unless you restrict gllamm to the imputed observations, that is, to the completed dataset only; that is not possible in mlong style. Even if you manage to include such a restriction, your approach does not account for the uncertainty associated with the imputed values, which is what multiple imputation is all about.
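
    If a single completed dataset is wanted despite that caveat, one possible route is mi extract, which replaces the data in memory with one completed dataset that is no longer mi set. This is only a sketch: the shortened predictor list and the seed are illustrative assumptions, and the standard errors from a single imputation will still understate the uncertainty.

    ```stata
    * Sketch only: shortened predictor list and arbitrary seed (assumptions)
    mi set mlong
    mi register imputed Lr_Do_you_own_ Lr2_number_adults_ LSize_Cat_

    * One imputation via chained equations
    mi impute chained (logit) Lr_Do_you_own_ (regress) Lr2_number_adults_ (ologit) LSize_Cat_ ///
        = Garden_Active_ i.Year LCommunity_Garden_, add(1) rseed(12345)

    * Replace the data in memory with the completed dataset for m = 1;
    * the result is an ordinary dataset, so no mi prefix is needed afterwards
    mi extract 1, clear
    ```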

    I have not used gllamm a lot, but you could probably use mi estimate's cmdok option to force it to work with multiply imputed data. You might not get all the auxiliary parameters but the coefficients and standard errors should be combined in the correct way.
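
    A minimal sketch of what the cmdok route might look like, assuming the data are already mi set and imputations have been added. Treating Garden_Active_ as the outcome and using a random-intercept logit are assumptions made for illustration; the actual analysis model is not shown in this thread.

    ```stata
    * gllamm is user-written; install it once with: ssc install gllamm
    * Assumption: binary outcome Garden_Active_, random intercept for Garden_ID
    mi estimate, cmdok: gllamm Garden_Active_ Lr_Do_you_own_ Lr2_number_adults_, ///
        i(Garden_ID) family(binomial) link(logit) adapt
    ```

    Because gllamm is refit on every imputed dataset, the run time grows roughly in proportion to the number of imputations.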

    Before proceeding, make sure you really need gllamm. Official Stata has added many features in the area of multilevel modeling since release 13 or so. So I wonder whether you can get what you want using official Stata commands, which might or might not work more easily with mi; even if that only allows you to get rid of those xi calls (which make me a bit nervous when combined with mi).

    A last, yet rather important issue to consider is the multilevel nature of your data, which should be reflected during the imputation process. Imputing multilevel data is not at all straightforward and an area of ongoing research. However, it seems to be known that ignoring the nested structure and imputing the data in the same way you would impute independent observations (so-called flat-file imputation) will likely lead to biased results.

    Best
    Daniel
    Last edited by daniel klein; 21 Feb 2019, 00:02.

    • #3
      Hi Daniel,

      Many thanks for your reply. What you said brings up another question: how do you account for the nested structure when imputing the data? I have read that you can impute separately for each cluster if you have a small number of clusters. However, in my case the nested structure is repeated measures on a unit of observation, so I have about 2300 clusters.

      Also, I am curious what makes you nervous about using xi with multiple imputation?

      Lastly, I did try the cmdok option with gllamm. It did end up working but took a very long time, which is why I thought it didn't work initially.

      thanks,
      Alyssa

      • #4
        There is a FAQ dealing with your question about nested structure and MI:
        https://www.stata.com/support/faqs/s...and-mi-impute/

        • #5
          Alyssa,

          xi manually creates dummy variables for each category. It's old syntax. Almost all current Stata commands don't require the xi prefix for categorical variables - we can use the factor variable syntax instead. I am not sure exactly why Daniel is nervous about using xi with mi impute, but I share his concern. You could just type

          Code:
          mi impute chained (logit) Lr_Do_you_own_ (regress) Lr2_number_adults_ (ologit) LSize_Cat_ = Garden_Active_ i.Year Garden_ID LCommunity_Garden_ LMarket_Garden_ LSite_Visit_Curr_or_Prior_ LSold_GID_ LPickups_ LUR_Curr_Yr_or_Prior_ LSOD_Curr_or_Prior_ LKGD_Curr_Or_Prior_ LYr_Act_Prior_ LSoil_Test_Curr_or_Prior_ i.r_L_classes i.r_L_volunteer_3_max i.r_L_social_2_max, add(1)
          and Stata will run that command fine. You can then use xi later on, if the command requires it, without issue - I've done it myself.

          I will reiterate that you may not need gllamm. You definitely do not need it if you are running a hierarchical model and you're running Stata 13 or later - and maybe even Stata 11 or 12, but I forget what commands are included with those versions. The mixed command doesn't require the xi prefix.
          Last edited by Weiwen Ng; 21 Feb 2019, 08:51.
          Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

          When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.

          • #6
            Originally posted by Alyssa Beavers
            Also, I am curious what makes you nervous about using xi with multiple imputation?
            I have not worked with xi in a long time and do not really remember when it creates new variables, drops them, and re-creates them. mi is complex machinery that creates temporary variables and datasets behind the scenes, and I am not sure whether xi gets the job done correctly. Judging from your syntax

            Code:
            xi: mi register imputed Lr2_number_adults_ LSize_Cat_ Lr_Do_you_own_;
            which lacks any terms that xi might process, I guess that your understanding of this prefix is probably not much better than mine. Therefore, I assume you are not sure whether it works as expected, either. There might not be a problem, but I would stick with factor variable notation, just to be on the safe side.

            Best
            Daniel

            • #7
              Hi all,

              I tried xtlogit instead of gllamm on my non-imputed dataset, and the results were identical. So it seems that xtlogit is the easier approach (and it is actually faster; I had thought it was supposed to be slower, but this is not the case). Rich Goldstein: with regard to the link you shared, thank you, but I'm having trouble applying these examples to my case. I have multiple variables of different types to impute (categorical, binary, and continuous). I think this means I need to use mi chained (that's what I understood from the manual at least: if you have multiple variables to impute, you need a command that can do them at the same time, and I also think this is the only command that can handle different variable types).
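
              For reference, a sketch of how xtlogit might be combined with mi estimate, assuming the data are already mi set and imputations have been added. Garden_Active_ as the outcome and the choice of covariates are assumptions for illustration only. Factor-variable notation (i.) stands in for xi here.

              ```stata
              * Assumption: binary outcome Garden_Active_, random intercept for Garden_ID
              * Declare the panel structure through mi so that every imputation carries it
              mi xtset Garden_ID

              * xtlogit is supported by mi estimate, so cmdok is not needed
              mi estimate: xtlogit Garden_Active_ i.Lr_Do_you_own_ Lr2_number_adults_ ///
                  i.LSize_Cat_ i.Year, re
              ```

              This addresses only the analysis step; how the imputation model should reflect the repeated measures is a separate question.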

              Someone asked for a dataex of my data, and you may find it below.

              Many thanks to all who have responded.

              Code:
              input int(Garden_ID Year) byte Garden_Active_ float(Lr2_number_adults_ Lr_Do_you_own_ LSize_Cat_ r_L_volunteer_3_max LFamily_Garden_ LSchool_Garden_ LCommunity_Garden_)
              4722 2013 0 30 0 3 0 0 0 1
              5754 2015 1 2 0 . 0 1 0 0
              2907 2013 1 30 1 4 3 0 0 1
              2891 2013 0 2 0 1 0 1 0 0
              4605 2013 0 1 0 1 0 1 0 0
              4894 2014 0 2 1 2 0 0 0 1
              1964 2013 1 1 0 1 0 1 0 0
              5645 2015 1 2 0 1 0 1 0 0
              5935 2015 0 1 0 1 0 1 0 0
              2707 2013 0 1 0 . 0 1 0 0
              3175 2014 0 2 0 1 0 0 0 1
              5290 2014 1 1 1 2 3 1 0 0
              3168 2014 0 . 0 . 0 1 0 0
              4831 2013 0 5 1 2 0 1 0 0
              3391 2014 0 2 0 1 0 1 0 0
              4829 2013 1 2 0 1 0 1 0 0
              3027 2013 1 3 . 1 0 1 0 0
              2054 2013 1 2 1 3 0 1 0 0
              5290 2015 1 2 1 2 3 1 0 0
              4829 2014 1 2 0 1 1 1 0 0
              5644 2015 1 10 1 3 1 0 0 1
              3872 2014 1 126 1 5 0 0 0 1
              6013 2015 1 7 0 . 0 0 0 1
              3082 2014 1 2 0 2 0 1 0 0
              3024 2013 1 3 0 5 0 1 0 0
              4796 2015 1 10 . 2 0 0 0 1
              2223 2015 1 5 1 3 0 0 0 1
              5128 2013 1 2 0 1 0 1 0 0
              2907 2014 1 15 1 4 0 0 0 1
              4974 2015 1 2 1 5 0 0 0 1
              4173 2015 1 10 0 1 1 0 0 1
              5886 2015 1 2 0 1 1 1 0 0
              17 2015 1 6 0 . 0 0 0 1
              4851 2014 1 6 0 5 0 0 0 1
              5514 2014 0 2 0 1 0 1 0 0
              1964 2015 1 2 0 1 0 1 0 0
              2054 2015 1 6 1 3 0 1 0 0
              3872 2013 1 172 1 5 0 0 0 1
              5497 2014 0 . 0 1 0 1 0 0
              4680 2013 0 2 1 2 0 1 0 0
              4974 2014 1 3 1 4 0 0 0 1
              5672 2015 1 2 0 2 0 1 0 0
              5630 2015 1 2 0 . 0 1 0 0
              3024 2015 0 2 0 1 0 1 0 0
              4173 2013 1 5 . 1 0 0 0 1
              5663 2015 1 16 0 1 0 0 0 1
              2163 2015 1 4 0 4 0 1 0 0
              4974 2013 1 3 1 4 0 0 0 1
              2163 2013 1 2 1 1 0 1 0 0
              4173 2014 1 4 1 1 2 0 0 1
              3669 2015 0 1 1 1 1 1 0 0
              4796 2013 1 10 1 1 0 0 0 1
              5795 2015 1 2 0 2 0 1 0 0
              5128 2015 1 2 0 1 0 1 0 0
              4796 2014 1 10 1 2 0 0 0 1
              3624 2015 1 1 1 2 0 1 0 0
              5521 2014 0 2 1 1 0 1 0 0
              2781 2013 1 4 1 . 0 0 0 1
              4399 2014 0 15 0 5 0 0 0 1
              3027 2014 1 3 0 1 0 1 0 0
              3028 2014 0 2 0 1 0 1 0 0
              3984 2013 0 8 0 3 0 0 0 1
              5771 2015 0 2 1 1 0 1 0 0
              4579 2013 0 1 0 1 0 0 0 1
              5979 2015 0 2 0 . 0 1 0 0
              2781 2015 1 6 1 3 0 0 0 1
              2781 2014 1 5 1 . 0 0 0 1
              4829 2015 0 2 0 1 0 1 0 0
              4851 2013 1 17 . 5 0 0 0 1
              3669 2013 1 2 1 1 1 1 0 0
              12029 2015 0 5 . . 0 1 0 0
              4159 2013 0 3 1 5 0 1 0 0
              17 2013 1 6 1 . 1 0 0 1
              4399 2013 1 32 0 5 1 0 0 1
              4253 2015 0 . . . 0 0 0 1
              5148 2013 0 . 0 . 0 0 0 1
              2163 2014 1 4 0 4 0 1 0 0
              6061 2015 0 2 0 1 0 0 0 1
              3175 2013 1 100 0 1 0 0 0 1
              5832 2015 1 6 0 . 0 0 0 1
              2211 2013 0 3 . 3 0 0 0 1
              3007 2014 0 3 0 . 1 1 0 0
              3865 2014 0 . 0 . 0 0 0 0
              3027 2015 1 1 0 1 0 1 0 0
              4851 2015 1 10 0 5 0 0 0 1
              12 2015 1 12 0 2 0 0 0 1
              17 2014 1 5 1 . 2 0 0 1
              3669 2014 1 2 1 2 0 1 0 0
              758 2015 0 . 0 1 0 0 0 1
              3082 2013 1 3 0 2 2 1 0 0
              5177 2014 0 4 0 1 0 0 0 1
              3007 2013 1 3 0 . 0 1 0 0
              3024 2014 1 2 0 4 0 1 0 0
              1964 2014 1 1 0 1 0 1 0 0
              12 2014 1 3 1 2 0 0 0 1
              2054 2014 1 4 1 2 1 1 0 0
              2223 2014 1 2 1 3 0 0 0 1
              5128 2014 1 2 0 1 0 1 0 0
              2223 2013 1 5 . 4 0 0 0 1
              3391 2013 1 1 0 1 0 1 0 0
              end

              • #8
                Originally posted by Alyssa Beavers
                Rich Goldstein : with regards to the link you shared--thank you, but I'm having trouble making these examples applicable to me. I have multiple variables of different types to impute (categorical, binary, and continuous). I think this means I need to use mi chained
                With the exception of the third approach, you can just plug in mi impute chained in place of mi impute regress. Note that the linked FAQ focuses on the data structure at a conceptual level, not on the specific commands you use.

                Best
                Daniel
