  • Individuals in survey contacted only in later waves: missing values

    Dear all,

    I have panel data with missing values, and the missingness has several causes.
    One cause is that some participants were first contacted in the 2nd, 3rd or a later wave, so they were not part of the survey in the earlier waves.
    In addition, some participants who joined in one wave dropped out after a few waves, or skipped one or two waves before returning in later waves.

    I want to use multiple imputation for those who participated in the first waves and then dropped out or skipped some later waves.

    My question: what should I do with the missing values for participants who joined only in later waves because they were contacted only then? I believe these should be considered MCAR. Is it really possible to impute under such a condition? Or should I just impute all missing values regardless of the reason for missingness?

    Best

  • #2
    It does not seem reasonable to use multiple imputation (MI) to account for the full set of variables in a given wave for an individual, especially attempting to impute values for individuals before they entered the survey. Baltagi's panel data econometrics book includes a chapter on the estimation of rotating panels, which are typical of survey panels where the same individual may not have been interviewed repeatedly. This approach allows one to measure the extent of rotation group bias under certain assumptions. If you still choose to use MI, impute forwards from the first time an individual entered the survey, not backwards.

    Reference
    Baltagi, B. H. (2013). Econometric Analysis of Panel Data. John Wiley & Sons.
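
    As a rough illustration of the "impute forwards, not backwards" advice, the sketch below labels each person-wave cell so that only cells after an individual's first observed wave are treated as candidates for imputation. The data layout (one list of response flags per individual), the function name `classify_cells`, and the label names are assumptions made for this example and do not come from any particular package.

```python
from typing import Dict, List


def classify_cells(observed: List[bool]) -> List[str]:
    """Label each wave for one individual.

    observed[t] is True if the individual responded in wave t.
    """
    if True not in observed:
        # Never interviewed: there is no entry wave to impute forwards from.
        return ["never_observed"] * len(observed)

    first = observed.index(True)                           # entry wave
    last = len(observed) - 1 - observed[::-1].index(True)  # last response

    labels = []
    for t, seen in enumerate(observed):
        if seen:
            labels.append("observed")
        elif t < first:
            labels.append("pre_entry")  # before recruitment: do not impute
        elif t > last:
            labels.append("dropout")    # left the panel: imputation candidate
        else:
            labels.append("gap")        # skipped wave: imputation candidate
    return labels


# Hypothetical response patterns over five waves.
panel: Dict[str, List[bool]] = {
    "A": [True, True, False, True, False],   # skips wave 2, gone after wave 3
    "B": [False, False, True, True, True],   # recruited in wave 2
    "C": [True, False, False, False, False], # drops out after wave 0
}

labels = {pid: classify_cells(obs) for pid, obs in panel.items()}

# Waves eligible for imputation: everything missing after entry.
eligible = {
    pid: [t for t, lab in enumerate(labs) if lab in ("gap", "dropout")]
    for pid, labs in labels.items()
}

for pid in panel:
    print(pid, labels[pid], eligible[pid])
```

    In this toy example individual B, who was recruited late, contributes no cells to the imputation step, while A and C do.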


    • #3
      Thank you so much, Andrew Musau, for your quick and clear answer. I did not know that recruiting new participants in different waves of a survey is referred to as a rotating panel. This is interesting. I will look up the reference you mentioned.

      In fact, the only reason I am using MI is that I suspect the values of the outcome variables (health-related) are driving people to drop out of the sample. However, we cannot be sure; they may also drop out for other reasons, such as losing interest in the survey or finding the questionnaire too long.

      So, I was going to impute both unit non-response and item non-response.
      I understand your concern about imputing a whole set of variables for a given wave and individual, but I was going to do it only for the missing outcome variables. The model includes only an interaction of two neighborhood-level variables, which are fully observed, plus the fixed effects. So I thought it would be OK to impute unit non-responses. Please let me know your opinion on this.

      I took a peek at what rotating panels are, and it turns out they are used to replace participants who leave the survey. However, since we do not know whether the newly recruited people are representative of the ones who left (in particular, we do not know what the leavers' outcomes would have been had they not left the survey), rotating panels do not really solve the problem of missing values because of attrition, right?

      One last question please: so your worry about using MI was only because it would be unreasonable to impute the full set of variables in a given wave for an individual and is not related to the fact of having rotating panels, right?
      If yes, your suggested solution is the use of forward imputation, right?


      • #4
        rotating panels do not really solve the problem of missing values because of attrition, right?
        Correct. It's just a method of handling the attrition and testing whether the rotation introduces bias. In the end, what we are really interested in is getting consistent estimates of the parameters of interest.


        One last question please: so your worry about using MI was only because it would be unreasonable to impute the full set of variables in a given wave for an individual and is not related to the fact of having rotating panels, right?
        Correct. MI assumes that data are missing at random (MAR). However, the degree of missingness matters. If you are imputing complete waves for half of the individuals in the survey, then MI may not be as helpful.
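
        One rough way to gauge the "degree of missingness" before deciding on MI is to count, among person-waves at or after each individual's entry, how many are missing. The sketch below assumes the same hypothetical layout of response flags per individual; the function name and the example data are illustrative only, and no threshold for "too much" is implied.

```python
from typing import Dict, List


def post_entry_missing_share(panel: Dict[str, List[bool]]) -> float:
    """Share of person-waves that are missing, counted from each
    individual's first observed wave onwards (pre-entry waves ignored)."""
    missing = 0
    total = 0
    for observed in panel.values():
        if True not in observed:
            continue  # never interviewed: no entry wave to count from
        window = observed[observed.index(True):]
        total += len(window)
        missing += window.count(False)
    return missing / total if total else 0.0


# Hypothetical response patterns over five waves.
panel = {
    "A": [True, True, False, True, False],
    "B": [False, False, True, True, True],
    "C": [True, False, False, False, False],
}

share = post_entry_missing_share(panel)
print(f"Missing share after entry: {share:.3f}")  # 6 of 13 person-waves
```

        A high share here would mean that most of the imputed data consist of whole missing waves, which is the situation described above where MI may not be as helpful.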

        If yes, your suggested solution is the use of forward imputation, right?
        If you choose to use MI, that's my recommendation. Imputing data backwards may lead to situations where individuals did not satisfy certain conditions for being in the survey, such as being too young or living in a different region. It would be challenging to convince someone reasonable of the logic behind such an imputation.
