Single imputation

shem shen

Join Date: Mar 2016

Posts: 136
#1


23 Nov 2018, 22:55

Hi experts,

I am trying to use Stata to conduct single imputation for three variables. Two are continuous and one is binary. I chose single imputation because previous literature suggests so.
I intend to use mi impute for this, because I cannot find any online resource on doing single imputation in Stata; everything I find is about multiple imputation.
I read that multiple variables need to be imputed simultaneously, so I chose mi impute chained, which seems to be the only version of mi impute that can impute continuous and binary variables together. My question is whether the syntax below is correct.

mi set wide
mi register imputed var1 var2 var3 // var1 and var2 are continuous; var3 is binary
set seed 181123
#delimit ;
mi impute chained (pmm,knn(5)) var1 var2 (logit) var3 = depvar indepvars, add(1);
#delimit cr

Two potential issues:
Am I correct in using add(1) in mi impute if my intention is to use mi impute to do single imputation? Or should I first produce, say 20 datasets (add(20)), and then take the average of them on var1, var2, and var3?
Am I correct in relying on mi impute chained in performing single imputation for continuous and categorical variables simultaneously?

I know my question is awkward. Many thanks in advance for any advice!
daniel klein

Join Date: Mar 2014

Posts: 3850
#2

24 Nov 2018, 00:37

Originally posted by shem shen

I chose single imputation because previous literature suggests so.

If this previous literature is rather old and you are not primarily interested in replicating a specific study, I suggest reconsidering this decision. Having said that, here is my take on your questions.

Originally posted by shem shen

Am I correct in using add(1) in mi impute if my intention is to use mi impute to do single imputation? Or should I first produce, say 20 datasets (add(20)), and then take the average of them on var1, var2, and var3?

You can do either. You could even randomly pick one of the 20 completed datasets. No matter what you do, your standard errors will be biased.
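A minimal sketch of the add(20) alternatives, assuming the data from #1 are in memory, already mi set wide and registered, and not yet imputed. The variable names are the placeholders from #1; the second seed, the local macro pick, and the *_avg variables are hypothetical names introduced for illustration only. The averaging step assumes the wide style stores imputation m of var1 as _m_var1.

```stata
* Assumes: mi set wide and mi register imputed var1 var2 var3 were already run
set seed 181123
mi impute chained (pmm, knn(5)) var1 var2 (logit) var3 = depvar indepvars, add(20)

* Option A: average the 20 imputed copies (wide style: _1_var1 ... _20_var1)
egen double var1_avg = rowmean(_*_var1)
egen double var2_avg = rowmean(_*_var2)
egen double var3_avg = rowmean(_*_var3)   // binary var3: the mean can be a fraction, not 0/1

* Option B: randomly pick one of the 20 completed datasets
set seed 20181124                         // hypothetical seed, any value works
local pick = runiformint(1, 20)
mi extract `pick', clear                  // data in memory become completed dataset number `pick'
```

Note that Option A leaves var3_avg as a proportion wherever var3 was imputed, so it would need rounding or some other rule before being used as a binary variable. Option B keeps var3 binary.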

Originally posted by shem shen

Am I correct in relying on mi impute chained in performing single imputation for continuous and categorical variables simultaneously?

If the missing pattern is monotone you could impute the missing values separately; but mi is clever enough to handle this, so I usually do not even check and just fire up the chained approach.
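For anyone who does want to check the pattern first, a sketch along these lines might work, again assuming the mi set data and placeholder variable names from #1:

```stata
* Assumes the data from #1 are in memory and already mi set
mi misstable patterns var1 var2 var3   // tabulate the missing-value patterns
mi misstable nested var1 var2 var3     // report whether the missingness is nested (monotone)

* Only applicable if the pattern is monotone; otherwise mi impute monotone
* should stop with an error and mi impute chained is the fallback
mi impute monotone (pmm, knn(5)) var1 var2 (logit) var3 = depvar indepvars, add(1)
```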

You will not be able to use mi estimate with only one imputed dataset. Make sure to exclude the original data before you run your analysis.
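One way to work with the single completed dataset only, sketched under the assumption that mi impute chained with add(1) from #1 has already run. The regress model is a hypothetical analysis model built from the placeholder names in #1, not something stated in the thread.

```stata
* Option 1: run the analysis on imputation m=1 while keeping the mi data intact
mi xeq 1: regress depvar var1 var2 var3 indepvars

* Option 2: replace the data in memory with completed dataset m=1
* (an ordinary, non-mi dataset; the original incomplete data are dropped)
mi extract 1, clear
regress depvar var1 var2 var3 indepvars
```

Either route avoids accidentally analysing m=0, the original data with missing values.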

Best
Daniel
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#3

24 Nov 2018, 02:49

Shem:
as usual, Daniel gave excellent advice.
Just an aside about your #1; you might be interested in getting a comprehensive picture of the evolution of methods for dealing with missing values (old-fashioned single imputation included) at: https://www.guilford.com/books/Missi.../9781593853938

Kind regards,
Carlo
(Stata 19.0)
shem shen

Join Date: Mar 2016

Posts: 136
#4

24 Nov 2018, 11:38

Thank you so much Daniel and Carlo! Your replies are super helpful!