I would like to ask, however, why you think that adjusting for the survey design is not useful. In my case the survey oversampled ethnic minorities and disadvantaged groups, and the pweights only cover attrition, not item non-response, which is what is causing my missing data.
As for your suggestion, do you mean separating by strata and PSU and then running mi impute chained?
This issue of weighting an imputation model is not one I've looked into in any depth, so I'll leave the question of whether or not to do it to others. But if you want to run separate imputation models for each combination of strata and PSU (which is one of several suggestions I've seen for how to incorporate them into an imputation model) all you need to do is add by(strata PSU) to the end of your mi impute chained command.
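For example, a minimal sketch of that command; the variable names (strata, psu, income, educ, age, female) are placeholders, not taken from the original posts:

* impute separately within each strata-by-PSU cell
* (each cell needs enough observations for the models to converge)
mi set wide
mi register imputed income educ
mi impute chained (regress) income (ologit) educ = age i.female, ///
    add(20) rseed(12345) by(strata psu)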
Russell Dimond
Statistical Computing Specialist
Social Science Computing Cooperative
University of Wisconsin-Madison
I had a theoretical reason for thinking that sample weighting of the imputation process was wrong: that prediction was intended for the particular sample, not for the population. But upon looking at the literature, I find I was wrong.
MI variance estimators can be biased if survey weights are not used in the imputation model and sampling is "informative". The situation is worst for domain analyses if the domain definition is not also a predictor in the model. (A sampling domain is a non-stratum subgroup for which separate analyses are required; the Stata term is sub-populations.) This problem was first demonstrated for the case of estimating a domain mean by Kott (1995). See the Introduction to Reist and Larsen (2012) for a brief summary.
So, weights should be incorporated into the imputation model. However, weighting the model (e.g., the weight option in mi impute), which is the solution for Kott's simplified problem, does not appear to be the best approach. Rather, the recommendation of Carpenter (2011) and others is to use the weights, first grouped, as main-effect predictors and as components of interaction terms. A preferable alternative, if available, is to incorporate into the model other variables that determine the weights. For example, in the Georgia Reproductive Health Survey (Serbanescu et al., 2011), selection probabilities differed by geographical stratum and by the number of females in the household eligible for the survey. Those factors could enter directly into an imputation model.
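A rough sketch of the grouped-weights approach described by Carpenter (2011); the variable names (pweight, income, age, female) are illustrative only, not from the original posts:

* group the sampling weights into quintiles and enter the groups as
* main effects and in an interaction in the imputation model
xtile wgtgrp = pweight, nq(5)
mi set wide
mi register imputed income
mi impute chained (regress) income = age i.female i.wgtgrp i.wgtgrp#i.female, ///
    add(20) rseed(12345)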
One approach to implementing Russell's suggestion is based on Reiter et al. (2006), who state:
In some surveys the design may be so complicated that it is impractical to include dummy variables for every cluster. In these cases, imputers can simplify the model for the design variables, for example collapsing cluster categories or including proxy variables (e.g., cluster size) that are related to the outcome of interest.
Thus, you could separately impute in subgroups formed by these variables.
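One way that might look in Stata, with hypothetical variable names (psu, income, age, female) and cluster size used to collapse the clusters into a few groups, as Reiter et al. suggest:

* collapse clusters into a small number of groups by cluster size,
* then impute separately within those groups
bysort psu: gen clustsize = _N
xtile psugrp = clustsize, nq(3)
mi set wide
mi register imputed income
mi impute chained (regress) income = age i.female, add(20) rseed(12345) by(psugrp)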
Kim, Jae Kwang, J. Michael Brick, Wayne A. Fuller, and Graham Kalton. 2006. On the bias of the multiple-imputation variance estimator in survey sampling. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(3): 509-521. http://jkim.public.iastate.edu/2006_JRSSB.pdf
Reiter, Jerome P., Trivellore E. Raghunathan, and Satkartar K. Kinney. 2006. The importance of modeling the sampling design in multiple imputation for missing data. Survey Methodology 32(2): 143.
Serbanescu, F., V. Egnatashvili, A. Ruiz, D. Suchdev, and M. Goodwin. 2011. Reproductive Health Survey Georgia, 2010: Summary Report. Atlanta, GA: Division of Reproductive Health, Centers for Disease Control and Prevention (DRH/CDC).
I need help getting a repeated-imputation inference program to work on the Survey of Consumer Finances. The code I have is
rii, imp(Y): regress X1 X2 X3 X4 X101, robust