  • Combined variables when using multiple imputation

    Dear all,
    I have studied every resource I could find on how to handle interaction terms, and variables created as a function of other incomplete variables, in MI.
    Apparently, there is no consensus on this topic and discussions are still ongoing. I am here seeking your latest opinion on this topic.

    In summary, there are about 4 ways to deal with interactions.
    • Passive approach: apparently everyone is against this
    • JAV (just another variable) approach: preferable only if the two incomplete variables used for the interaction are both continuous, right?
    • Impute the data separately for each group, if one of the variables in the interaction is binary/categorical and has no missing values
    • Use the include() option. I was not aware of this way of doing things, but I saw that Daniel Klein has suggested it in this post.

    If I am able to use any of these approaches, which one would you recommend based on your experience with this topic?


    Second, for variables created as a function of other variables, I found 2 opposing opinions here by Clyde Schechter and Richard Williams.
    I have health scores that are computed from several individual items. So, should I include both the scores and the items in the imputation model (JAV approach), or should I create the scores from the imputed items (passive approach)?

    Thank you so much for any remarks on these two questions.

  • #2
    I wasn't aware of the include() option either. I'm curious how it works and how results with it differ from using JAV or passive imputation.

    For scales, I'm not sure that Clyde and I disagree that much. If you create a scale by adding 10 items together, it is conceivable that every case, or almost every case, could have missing data on the scale. It is also conceivable that only a few cases have missing data on the scale. It depends on the nature of the missing data in the scale items. For example, if most cases have a little missing data, but the items they are missing differ widely across cases, then many cases will be missing on the scale. If you then use JAV, you'll be throwing away a huge amount of observed item data, and JAV will be a poor choice.
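
    To put rough numbers on the point about scale items, here is a minimal Python sketch. It rests on an assumption that is not stated in the thread: each item is missing independently with the same probability, and the scale is missing whenever any one item is missing. The helper name `share_missing_on_scale` is hypothetical. Real missingness patterns are usually correlated, so treat the output as an illustration only.

```python
def share_missing_on_scale(p_item: float, n_items: int) -> float:
    """Share of cases with a missing sum score.

    Assumes (for illustration only) that each item is missing
    independently with probability p_item, and that the score is
    missing as soon as any single item is missing.
    """
    # A case has a complete score only if all n_items are observed.
    share_complete = (1.0 - p_item) ** n_items
    return 1.0 - share_complete


if __name__ == "__main__":
    n_items = 10  # the 10-item scale used as the example in the post
    for p_item in (0.01, 0.05, 0.10):
        share = share_missing_on_scale(p_item, n_items)
        print(f"{p_item:.0%} missing per item -> {share:.1%} of cases missing on the scale")
    # 1% missing per item -> 9.6% of cases missing on the scale
    # 5% missing per item -> 40.1% of cases missing on the scale
    # 10% missing per item -> 65.1% of cases missing on the scale
```

    Under these assumptions, even 5% missingness per item leaves about 40% of cases without a scale score, although each of those cases still has most of its items observed. That observed item information is what goes unused if only the scale itself is imputed.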

    When in doubt, it is often a good idea to try it multiple ways and see if it makes any difference. If somebody disagrees with the strategy you chose, it is nice to show that, even if you did make a mistake, it didn't hurt you much.

    Of course, if different strategies produce strikingly different results, then you have to be really sure you have chosen the right one.

    I still stand by what I said in the post of mine that you cite. JAV is generally best. But when constructing scales, imputation of individual items and then using passive imputation to create a scale may be best. But having said that, I don't know enough about the include() option to say whether or not it is at least sometimes better.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 18.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam
