  • Single imputation

    Hi experts,

    I am trying to use Stata to perform single imputation for three variables: two are continuous and one is binary. I chose single imputation because previous literature suggests so.
    I intend to use mi impute for this because I cannot find any online resources on single imputation in Stata; everything I find is about multiple imputation.
    I read that multiple variables need to be imputed simultaneously, so I chose mi impute chained, which seems to be the only version of mi impute that can impute continuous and binary variables at the same time. My question is whether the syntax below is correct.

    mi set wide
    mi register imputed var1 var2 var3 // var1 and var2 are continuous; var3 is binary
    set seed 181123
    #delimit ;
    mi impute chained (pmm,knn(5)) var1 var2 (logit) var3 = depvar indepvars, add(1);
    #delimit cr

    Two potential issues:
    Am I correct in using add(1) in mi impute if my intention is to use mi impute to do single imputation? Or should I first produce, say 20 datasets (add(20)), and then take the average of them on var1, var2, and var3?
    Am I correct in relying on mi impute chained in performing single imputation for continuous and categorical variables simultaneously?

    I know my question is awkward. Many thanks in advance for any advice!

  • #2
    Originally posted by shem shen
    I chose single imputation because previous literature suggests so.
    If this previous literature is rather old and you are not primarily interested in replicating a specific study, I suggest reconsidering this decision. Having said that, here is my take on your questions.

    Originally posted by shem shen
    Am I correct in using add(1) in mi impute if my intention is to use mi impute to do single imputation? Or should I first produce, say 20 datasets (add(20)), and then take the average of them on var1, var2, and var3?
    You can do either. You could even randomly pick one of the 20 completed datasets. No matter what you do, your standard errors will be biased (typically too small, because the imputed values are treated as if they had been observed).
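
The averaging route asked about above could look roughly like the sketch below. It is not a recommendation, and several parts are assumptions rather than anything from this thread: the auto dataset shipped with Stata stands in for the real data (mpg and weight play the role of var1 and var2, foreign that of var3), price is a hypothetical fully observed predictor, the missing values are deleted artificially, and the augment option is added only to guard against perfect prediction in such a small sample. Replacing add(20) with add(1) gives the single-draw route instead.

```stata
* Stand-in data (assumption): auto dataset with artificially deleted values
sysuse auto, clear
set seed 181123
replace mpg     = . if runiform() < 0.15
replace weight  = . if runiform() < 0.15
replace foreign = . if runiform() < 0.15

* Declare the mi style and the variables to be imputed
mi set wide
mi register imputed mpg weight foreign

* 20 imputations: pmm for the continuous variables, logit for the binary one
mi impute chained (pmm, knn(5)) mpg weight (logit, augment) foreign ///
    = price, add(20) rseed(181123)

* In wide style the imputations are stored as _1_mpg, ..., _20_mpg;
* average them row by row (observed values are unchanged by the averaging)
egen double mpg_avg     = rowmean(_*_mpg)
egen double weight_avg  = rowmean(_*_weight)
egen double foreign_avg = rowmean(_*_foreign)  // a proportion, not 0/1
```

Note that the averaged binary variable is a proportion between 0 and 1, so it would still need a rounding rule before it can be used as a 0/1 indicator.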

    Originally posted by shem shen
    Am I correct in relying on mi impute chained in performing single imputation for continuous and categorical variables simultaneously?
    If the missing-data pattern is monotone, you could impute the missing values separately; but mi is clever enough to handle this on its own, so I usually do not even check and just fire up the chained approach.

    You will not be able to use mi estimate with only one imputed dataset. Make sure to exclude the original data (m = 0 in mi terms) before you run your analysis.
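
One way to exclude the original data is mi extract, which replaces the data in memory with a single completed dataset. The sketch below is only illustrative and rests on assumptions: the auto dataset shipped with Stata stands in for the real data, the missing values are deleted artificially, price is a hypothetical fully observed predictor, the regression is a placeholder analysis, and augment is added only to guard against perfect prediction in a small sample.

```stata
* Stand-in data (assumption): auto dataset with artificially deleted values
sysuse auto, clear
set seed 181123
replace mpg     = . if runiform() < 0.15
replace weight  = . if runiform() < 0.15
replace foreign = . if runiform() < 0.15

* One imputation only
mi set wide
mi register imputed mpg weight foreign
mi impute chained (pmm, knn(5)) mpg weight (logit, augment) foreign ///
    = price, add(1) rseed(181123)

* Keep the completed dataset m = 1; the original data (m = 0) are dropped
* and the result is an ordinary, non-mi dataset
mi extract 1, clear

* Placeholder analysis with an ordinary estimation command (not mi estimate)
regress price mpg weight i.foreign
```

The standard errors reported by regress here ignore the uncertainty in the imputed values, which is the bias mentioned above.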

    Best
    Daniel



    • #3
      Shem:
      As usual, Daniel gave excellent advice.
      Just an aside about your #1: you might be interested in getting a comprehensive picture of the evolution of methods for dealing with missing values (old-fashioned single imputation included) at: https://www.guilford.com/books/Missi.../9781593853938
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        Thank you so much Daniel and Carlo! Your replies are super helpful!
