  • What is the reference for excluding a covariate if there are many missing observations?

    Normally, I exclude a covariate from the regression equation if it has many missing observations, say one-fifth fewer observations than the other variables in general (for example, 80,000 compared to 100,000).

    But now, reflecting on this, I wonder whether there is any reference or justification for excluding a covariate like that. I think it may relate to the within-sample variation and explanatory power, given that the sample shrinks, but I am not sure about that.

  • #2
    Phuc:
    In the second part of his outstanding teaching notes on how to deal with missing values, Richard Williams (I do think that his students are really lucky to have such a towering professor) states that "Some people say to not even consider MI unless at least 15% or 20% of your data is missing." (See https://www3.nd.edu/~rwilliam/stats2/l13.pdf, Warning box.)
    Whether 20% missing values is an issue for your analysis depends on several considerations (some of which are correlated):
    1) are we talking about available case analysis or complete case analysis?;
    2) is the missingness informative or not?;
    3) does the missingness affect the relevant variables you're investigating?;
    4) will the journal you plan to submit your paper to accept an analysis that simply drops missing values?;
    5) does the budget of the research project include extra funds for carrying out a missing data analysis?
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Originally posted by Phuc Nguyen
      Normally, I exclude a covariate out of the regression equation if there are many missing observation,
      (emphasis mine)

      Phuc is not talking about excluding missing observations, which would lead to available or complete case analyses; the issue is excluding variables!

      I think that there are very few -- if any -- situations in which you could justify excluding a covariate because of missing values. Either the covariate belongs in the model or it does not; I do not see how (the amount of) missing values enters that discussion.
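
To make the distinction between excluding observations and excluding variables concrete, here is a minimal sketch in Python (not Stata) on made-up data. The variable names `y`, `x1`, `x2` and the helper `complete_cases` are hypothetical, and the missingness pattern (every fifth row) is an assumption chosen only to reproduce the 80,000 versus 100,000 figures from the original question. It shows that keeping a covariate with missing values shrinks the complete-case estimation sample, whereas dropping the covariate keeps all rows but changes the model.

```python
# Hypothetical data: 100,000 rows; covariate x2 is missing in every fifth row.
N = 100_000
rows = [
    {
        "y": float(i % 7),
        "x1": float(i % 3),
        "x2": None if i % 5 == 0 else float(i % 11),
    }
    for i in range(N)
]


def complete_cases(data, variables):
    """Return the rows that have no missing value in any listed variable."""
    return [r for r in data if all(r[v] is not None for v in variables)]


# Model without x2: every row is usable, but x2 is omitted from the model.
n_without_x2 = len(complete_cases(rows, ["y", "x1"]))

# Model with x2: complete case analysis drops every row where x2 is missing.
n_with_x2 = len(complete_cases(rows, ["y", "x1", "x2"]))

print(n_without_x2, n_with_x2)
```

The sketch only counts usable rows; it says nothing about whether the missingness is informative, which is what determines whether either estimation sample gives trustworthy results.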

      • #4
        Daniel:
        Thanks for correcting me.
        I was under the impression that the two issues were linked (as I've experienced in some projects; obviously, my experience may be anecdotal in this respect).
        Kind regards,
        Carlo
        (Stata 19.0)

        • #5
          Originally posted by daniel klein
          (emphasis mine)

          Phuc is not talking about excluding missing observations, which would lead to available or complete case analyses; the issue is excluding variables!

          I think that there are very few -- if any -- situations in which you could justify excluding a covariate because of missing values. Either the covariate belongs in the model or it does not; I do not see how (the amount of) missing values enters that discussion.
          Thank you, Daniel, I get your point.
